
Friday, May 27, 2016

OpsMgr (SCOM) - Unix/Linux Agent Powershell DSC

Who hasn't had a Unix/Linux agent deployment issue?

Remembering an earlier post I wrote, "OpsMgr (SCOM) - Unix/Linux Agents Requisites and Troubleshooting",
I came up with the idea of writing a script that runs those validations for us.

Basically, it logs in to your Unix/Linux servers with your own credentials and runs a bunch of configuration tests.
Please remember that it is tailored to my own scenario, so read the code and edit it for yours.

First of all, you'll need this library:
http://www.powershelladmin.com/wiki/SSH_from_PowerShell_using_the_SSH.NET_library

You can put it in your favourite Modules folder (e.g. C:\Program Files\WindowsPowerShell\Modules).

Finally!
You can run this script from your Unix/Linux Resource Pool gateway or management server (MS):

 $ServerList = 'C:\Powershell\SCXAgentDSC\list.txt'  
 $SCXAgents = Get-Content -Path $ServerList  
   
 # Change these values to your own
 $user = 'Your_Run_AsAccountGoesHere!'   
 $pass = ConvertTo-SecureString 'YourPassword' -AsPlainText -Force   
 $creds = New-Object System.Management.Automation.PsCredential($user,$pass)  
   
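 # -ErrorAction Stop makes a failed import a terminating error, so the Catch block actually runs
 try { Import-Module SSH-Sessions -ErrorAction Stop }
 Catch { 'No SSH Modules Found' ; Exit }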
 foreach ( $scxagent in $SCXAgents ) {  
   $scxdomain = ($scxagent -split "\.")[-2..-1] -join '.'  
   # Change these values to your own here as well
   if( (New-SshSession -ComputerName $scxagent -Username Your_Run_AsAccountGoesHere -Password "YourPassword") -notmatch "successfully") {
     $scxagent + ' Could not SSH (bad user / password ? | Or no route ? )'  
     $SSHStatus = "1"  
   } Else { $SSHStatus = "0" }  
   If ($SSHStatus -eq "0" ) {  
     Invoke-SshCommand -Quiet -ComputerName $scxagent -Command "sudo -l" | Out-File C:\Powershell\sudo.txt  
     Invoke-SshCommand -Quiet -ComputerName $scxagent -Command "cat /etc/issue" | Out-File C:\Powershell\issue.txt  
     Invoke-SshCommand -Quiet -ComputerName $scxagent -Command "openssl x509 -noout -in /etc/opt/microsoft/scx/ssl/scx.pem -subject -issuer -dates" | Out-File C:\Powershell\certconfig.txt  
     # The sudo checks below only apply if you have a limited sudo configuration
     # Count the lines where the sudo config escapes the EC (error code) variable
     $ECCount = (Get-Content C:\Powershell\sudo.txt | select-string -SimpleMatch "EC\=0" | measure).Count
     # Check whether you have enough permissions for RPM install and uninstall
     $RPMLines = Get-Content C:\Powershell\sudo.txt | select-string -SimpleMatch "--force /tmp/scx-monuser/scx"
     # Check whether you have full root permissions (then no further sudo config is needed, so comment out the lines that do not match your scenario)
     $SUDOALL = Get-Content C:\Powershell\sudo.txt | select-string -SimpleMatch "(root) NOPASSWD: ALL"
     # Check whether you can re-generate certificates if needed
     $SSLConfig = Get-Content C:\Powershell\sudo.txt | select-string -SimpleMatch "/opt/microsoft/scx/bin/tools/scxsslconfig"
     # Check whether you have a certificate and whether it was issued for the correct FQDN
     $CertConfig = Get-Content C:\Powershell\certconfig.txt | select-string -SimpleMatch "$scxagent"
     $SCXSSLDomain = ((Get-Content C:\Powershell\certconfig.txt | Select-String -SimpleMatch "subject") -split "=")[-1]
     # Port testing (22 and 1270)
     Try { If ((new-object System.Net.Sockets.TcpClient("$scxagent","1270")).connected -eq $true ) { $AgentPortStatus = "OK" } Else { $AgentPortStatus = "NOT OK" } } Catch { $AgentPortStatus = "NOT OK"}  
     Try { If ((new-object System.Net.Sockets.TcpClient("$scxagent","22")).connected ) { $sshstatus = "OK"} Else { $sshstatus = "NOT OK" } } Catch { $sshstatus = "NOT OK" }  
     # WSMan testing
     If ( Test-WSMan -Port 1270 -ComputerName $scxagent -Authentication Basic -Credential $creds -UseSSL -ErrorAction SilentlyContinue ) { $wsmanstatus = 'OK' } Else { $wsmanstatus = 'NOT OK' }  
     $scxagenturi = "https://"+"$scxagent"+":1270/wsman"  
     # WinRM validation
     Try { If ( winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_Agent?__cimnamespace=root/scx -username:'YOUR_USERNAME_HERE' -password:'YOUR_PASSWORD_HERE' -r:$scxagenturi -auth:basic -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8 ) { $winrmstatus = "OK" } Else { $winrmstatus = "NOT OK" } } Catch { $winrmstatus = "NOT OK" }  
     If ( $ECCount -gt 0 )                   { $ecstatus = "NOT OK" } Else { $ecstatus = "OK" }
     If ( $SUDOALL )                         { $sudoallstatus = "OK" } Else { $sudoallstatus = "NOT OK" }
     If ( $RPMLines -match "[0-9]" )         { $rpmstatus = "NOT OK" } Else { $rpmstatus = "OK" }
     If ( $SSLConfig -match "scxsslconfig" ) { $SSLConfigStatus = "OK" } Else { $SSLConfigStatus = "NOT OK" }
     If ( $CertConfig -match "$scxagent" -and $CertConfig -match $scxdomain) { $CertificateStatus = "OK" } Else { $CertificateStatus = "NOT OK" }
     # Remove the checks that do not match your scenario (for the sudo config)
     Write-Output "$scxagent | WSMAN : $wsmanstatus | SSH : $sshstatus | AgentPort : $AgentPortStatus | EC SUDOConfig : $ecstatus | RPM SUDOConfig : $rpmstatus | SUDOAll : $sudoallstatus | SCXConfig SUDOConfig : $SSLConfigStatus | Certificate : $CertificateStatus | WinRM : $winrmstatus"  
   }  
 }  
 Remove-SshSession -RemoveAll | Out-Null  


Wednesday, March 30, 2016

OpsMgr (SCOM) - Unix/Linux Agents Requisites and Troubleshooting

UNIX/Linux Monitoring/Discovery in OpsMgr can be very hard to troubleshoot sometimes.
You can have several discovery issues, like:
- SSH connection errors;
- Certificate Issues;
- Network port issues;
- Bad SUDO permissions;
- And so on.

There is a lot of information spread across several blogs, but I have never found a centralized source of troubleshooting steps on this topic, so I decided to create this post and keep updating it with the errors I find and their resolutions.

First, check whether your error is already documented on TechNet:
http://social.technet.microsoft.com/wiki/contents/articles/4966.scom-2012-troubleshooting-unixlinux-agent-discovery.aspx

If not, you can read further :)

To start with, here are all the prerequisites your environment needs for discovery to work.

First, you need a user!
Root, or a user with sudo permissions?
If your UNIX/Linux sysadmin wants to restrict what you can do through the 'sudoers' file, this is what you need:
(root) NOPASSWD: /bin/sh -c cp /tmp/scx-monuser/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-monuser; /opt/microsoft/scx/bin/tools/scxadmin -restart  
(root) NOPASSWD: /bin/sh -c sh /tmp/scx-monuser/GetOSVersion.sh; EC\=$?; rm -rf /tmp/scx-monuser; exit $EC  
(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem  
(root) NOPASSWD: /bin/sh -c rpm -e scx  
(root) NOPASSWD: /bin/sh -c /bin/rpm -F --force /tmp/scx-monuser/scx-*.rpm; EC\=$?; cd /tmp; rm -rf /tmp/scx-monuser; exit $EC  
(root) NOPASSWD: /bin/sh -c /bin/rpm -U --force /tmp/scx-monuser/scx-*.rpm; EC\=$?; cd /tmp; rm -rf /tmp/scx-monuser; exit $EC  
(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p  
# I've added this so you can re-generate certificates if you need to  
(root) NOPASSWD: /opt/microsoft/scx/bin/tools/scxsslconfig *  

If not, you'll just need:
monuser ALL=(ALL) NOPASSWD: ALL
Remember that you also need to uncomment the line:
#Defaults !requiretty
so that it reads:
Defaults !requiretty
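
To confirm that a limited sudo configuration covers the entries listed above, you can scan the output of `sudo -l` for each required command. This is only a minimal sketch: the function name `missing_sudo_entries` is made up for illustration, the fragments are shortened copies of the entries above, and it assumes `sudo -l` prints each entry on a single, unwrapped line.

```shell
#!/bin/sh
# Hypothetical helper: prints every required fragment that does NOT appear
# in the saved `sudo -l` output passed as $1. Prints nothing if all are there.
missing_sudo_entries() {
    sudo_output_file=$1
    # Unrestricted sudo makes the individual entries unnecessary.
    if grep -qF -- "NOPASSWD: ALL" "$sudo_output_file"; then
        return 0
    fi
    # Fragments copied from the sudoers entries listed in this post.
    for fragment in \
        "/opt/microsoft/scx/bin/tools/scxadmin -restart" \
        "/tmp/scx-monuser/GetOSVersion.sh" \
        "cat /etc/opt/microsoft/scx/ssl/scx.pem" \
        "rpm -e scx" \
        "rpm -F --force /tmp/scx-monuser/scx" \
        "rpm -U --force /tmp/scx-monuser/scx" \
        "/opt/microsoft/scx/bin/scxlogfilereader -p" \
        "/opt/microsoft/scx/bin/tools/scxsslconfig"
    do
        # -F: fixed string, -q: quiet, --: end of grep options
        grep -qF -- "$fragment" "$sudo_output_file" || echo "missing: $fragment"
    done
}
```

On the UNIX/Linux server, logged in as the monitoring user, you could run `sudo -l > /tmp/sudo.txt` and then `missing_sudo_entries /tmp/sudo.txt`; any line it prints is an entry to ask your sysadmin about.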

Next, make sure your UNIX/Linux Resource Pool servers can reach your UNIX/Linux servers over TCP on ports 22 and 1270.

Another thing you might need to do is re-generate your SCX agent certificate.
Some companies have two different FQDNs for the same server, so that it can answer on a different network interface (the management network instead of the service network). If you're discovering your server by that management FQDN, the certificate must be generated for the FQDN you're discovering the server with.
E.g.
You have the server myserver.mydomain.com to discover
The management FQDN is myserver.mymngtdomain.com
You'll be discovering your server by myserver.mymngtdomain.com
So you need to ensure that the certificate is generated for that name.
Log in by SSH to myserver.mymngtdomain.com and run:
openssl x509 -noout -in /etc/opt/microsoft/scx/ssl/scx.pem -subject -issuer -dates  
If it's not the FQDN you want, run :
sudo /opt/microsoft/scx/bin/tools/scxsslconfig -f -h myserver -d mymngtdomain.com
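
If you'd rather not eyeball the subject line, the comparison can be scripted. The sketch below is an assumption on my part, not part of the SCX tools: `subject_has_fqdn` is a made-up name, and it assumes the subject is printed in one of the two common OpenSSL layouts (`/DC=.../CN=...` or `DC = ..., CN = ...`).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the subject line printed by
# `openssl x509 -noout -subject` contains CN=<expected FQDN>.
#   $1 = subject line, $2 = FQDN you discover the server with
subject_has_fqdn() {
    subject_line=$1
    expected_fqdn=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    # Normalise both layouts: drop spaces and the "subject=" prefix,
    # split the components onto separate lines, lowercase everything,
    # then look for an exact "cn=<fqdn>" line.
    printf '%s\n' "$subject_line" \
        | tr -d ' ' \
        | sed 's/^subject=//' \
        | tr '/,' '\n\n' \
        | tr '[:upper:]' '[:lower:]' \
        | grep -qxF -- "cn=$expected_fqdn"
}
```

For example, `subject_has_fqdn "$(openssl x509 -noout -in /etc/opt/microsoft/scx/ssl/scx.pem -subject)" myserver.mymngtdomain.com || echo "certificate needs to be re-generated"`.
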
These are the errors you may come across:

1) WinRM cannot complete the operation.
    Verify that the specified computer name is valid, that the computer is accessible over the network, and that a firewall exception for the WinRM service is enabled and allows access from this computer.
    By default, the WinRM firewall exception for public profiles limits access to remote computers within the same local subnet.
2) Agent verification failed. Error detail: The server certificate on the destination computer (SERVER_FQDN:1270) has the following errors
3) DNS Configuration error:
    The provided hostname SERVER_FQDN resolved to the IP address of x.x.x.x.
    The hostname SERVER_FQDN returned by reverse lookup of the IP address x.x.x.x did not match the provided hostname.
    Verify the DNS configuration and try the request again.
4) sudo: no tty present and no askpass program specified
5) The agent responded to the request but the WSMan connection failed due to: Access is Denied.

# 1) WinRM cannot complete the operation.

Verify that your FQDN is correct;
Verify that you have TCP/IP connectivity to your server on ports 22 and 1270;
You can use this PS1 script from one of your UNIX/Linux Resource Pool servers to check:
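$list = Get-Content -Path 'Path_to_ServerList'
Foreach ($server in $list) {
    # TcpClient throws when the port cannot be reached, so an exception counts as "not connected"
    Try { $SSHStatus = (New-Object System.Net.Sockets.TcpClient($server, 22)).Connected } Catch { $SSHStatus = $false }
    Try { $MNGTStatus = (New-Object System.Net.Sockets.TcpClient($server, 1270)).Connected } Catch { $MNGTStatus = $false }
    # Output: server | SSH port (22) reachable | agent port (1270) reachable
    "$server | $SSHStatus | $MNGTStatus"
}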
# 2) Agent verification failed. Error detail: The server certificate on the destination computer (SERVER_FQDN:1270) has the following errors

The certificate does not match the FQDN you're discovering the server with.
For example:
You have the server myserver.mydomain.com to discover
The management FQDN is myserver.mymngtdomain.com
You'll be discovering your server by myserver.mymngtdomain.com
So you need to ensure that the certificate is generated for that name.
Log in by SSH to myserver.mymngtdomain.com and run:
openssl x509 -noout -in /etc/opt/microsoft/scx/ssl/scx.pem -subject -issuer -dates
If it's not the FQDN you want, run :
sudo /opt/microsoft/scx/bin/tools/scxsslconfig -f -h myserver -d mymngtdomain.com
If you have several UNIX/Linux servers in this condition, use this shell script I wrote to correct them:
(I personally use MobaXterm for this kind of thing)
#!/bin/sh
for i in `cat list`
do
    a=`echo $i | cut -d"." -f1`
    ssh monuser@$i "sudo /opt/microsoft/scx/bin/tools/scxsslconfig -f -h $a -d mymngtdomain.com"
done
# 3) DNS Configuration error.

You might need to correct name resolution configuration for forward and reverse lookup on your DNS server.
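
Before touching the DNS server, it helps to see which lookup is actually wrong. Here is a rough sketch of a forward-then-reverse check; the function names are made up, and it assumes `dig` is installed on the machine you run it from.

```shell
#!/bin/sh
# Hypothetical helper: compares two hostnames while ignoring case and a
# trailing dot (reverse lookups usually come back as "host.domain.").
same_hostname() {
    a=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    b=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Resolves the name to an IP, resolves that IP back to a name, and reports
# whether the round trip returns the name you started with.
check_dns_roundtrip() {
    fqdn=$1
    # Keep only the first IPv4 address from the answer (skips CNAME lines).
    ip=$(dig +short "$fqdn" A | grep -E '^[0-9.]+$' | head -n 1)
    if [ -z "$ip" ]; then
        echo "$fqdn | no forward (A) record"
        return 1
    fi
    reverse=$(dig +short -x "$ip" | head -n 1)
    if same_hostname "$fqdn" "$reverse"; then
        echo "$fqdn | $ip | OK"
    else
        echo "$fqdn | $ip | reverse lookup returned '$reverse'"
        return 1
    fi
}
```

Running `check_dns_roundtrip myserver.mymngtdomain.com` tells you whether the forward record is missing or the reverse (PTR) record points to a different name, which is the mismatch the discovery error complains about.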

# 4) sudo: no tty present and no askpass program specified

Well, this is a sudo problem.
Check whether you have either this in /etc/sudoers:
monuser ALL=(ALL) NOPASSWD: ALL
or this:
(root) NOPASSWD: /bin/sh -c cp /tmp/scx-monuser/scx.pem /etc/opt/microsoft/scx/ssl/scx.pem; rm -rf /tmp/scx-monuser; /opt/microsoft/scx/bin/tools/scxadmin -restart
(root) NOPASSWD: /bin/sh -c sh /tmp/scx-monuser/GetOSVersion.sh; EC\=$?; rm -rf /tmp/scx-monuser; exit $EC
(root) NOPASSWD: /bin/sh -c cat /etc/opt/microsoft/scx/ssl/scx.pem
(root) NOPASSWD: /bin/sh -c rpm -e scx
(root) NOPASSWD: /bin/sh -c /bin/rpm -F --force /tmp/scx-monuser/scx-*.rpm; EC\=$?; cd /tmp; rm -rf /tmp/scx-monuser; exit $EC
(root) NOPASSWD: /bin/sh -c /bin/rpm -U --force /tmp/scx-monuser/scx-*.rpm; EC\=$?; cd /tmp; rm -rf /tmp/scx-monuser; exit $EC
(root) NOPASSWD: /opt/microsoft/scx/bin/scxlogfilereader -p
# I've added this so you can re-generate certificates if you need to
(root) NOPASSWD: /opt/microsoft/scx/bin/tools/scxsslconfig *
Also check whether this line is still commented out in /etc/sudoers; if it is, uncomment it:
#Defaults !requiretty
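
The requiretty check is easy to script as well. This is just a sketch with a made-up function name, and it only recognises the exact `Defaults !requiretty` form shown in this post:

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the sudoers file given as $1
# contains an active, uncommented "Defaults !requiretty" line.
requiretty_disabled() {
    # Leading whitespace is fine; a leading "#" turns the line into a comment,
    # so "#Defaults !requiretty" does not match.
    grep -Eq '^[[:space:]]*Defaults[[:space:]]+!requiretty([[:space:]]|$)' "$1"
}
```

Run as a user that is allowed to read the file, something like `requiretty_disabled /etc/sudoers || echo "Defaults !requiretty is missing or still commented out"` shows whether the line needs attention.
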
# 5) The agent responded to the request but the WSMan connection failed due to: Access is Denied.

If you get the "The agent responded to the request but the WSMan connection failed due to: Access is Denied." error, first run this from one of your UNIX/Linux Resource Pool servers:
Test-WSMan -Port 1270 -ComputerName "ServerName" -Authentication Basic -Credential (Get-Credential) -UseSSL
If you get an error, you might need to edit your PAM file (/etc/pam.d/omi) so that it looks like this:
omi auth sufficient pam_vas3.so create_homedir get_nonvas_pass store_creds try_first_pass
omi auth requisite pam_vas3.so echo_return
omi auth required /usr/lib/security/pam_aix use_new_state use_first_pass
omi account required /usr/lib/security/pam_seos.o
omi account sufficient pam_vas3.so
omi account requisite pam_vas3.so echo_return
omi account required /usr/lib/security/pam_aix
Re-discover your agent, and you'll get it working!