Tip
Check out the repository on GitHub
Check out the demo at: demo.webui.ansibleguy.net | Login: User demo, Password Ansible1337
Warning
DISCLAIMER: This is an unofficial community project! Do not confuse it with the vanilla Ansible product!
Troubleshooting¶
Topology¶
AnsibleGuy WebUI is made of a few main components.
Troubleshooting is easier once we find out in which component the error occurs.
Debugging¶
You can enable the debug mode at the System - Config page.
If that is not possible, you can alternatively set the AW_DEBUG environment variable.
This debug mode SHOULD ONLY BE ENABLED TEMPORARILY! It could possibly open attack vectors.
You might need to restart the application to apply this setting.
Versions¶
You can find the versions of software packages in use at the System - Environment page.
Alternatively, you can check it from the CLI: python3 -m ansibleguy-webui.cli --version
Job Execution¶
If you want to troubleshoot a job execution, you will have to find out if it is an issue with Ansible or the WebUI system.
The Ansible execution itself can fail because of some common issues:
- Unable to connect
  - Network issue
  - Wrong credentials supplied
  - Target system is misconfigured
- Controller dependencies
  - Ansible needs Python modules and, in some cases, Ansible Collections and Ansible Roles to function correctly
  - These need to be installed and should be up to date.
  - You can find the current versions used by your controller system at the System - Environment page
  - If you are using Docker, you can install those dependencies using requirements files. See Usage - Docker
to be continued..
Common Issues¶
SSH Hostkey Verification¶
Error: While executing Ansible you see: Host key verification failed
Problem:
SSH has a security feature that should keep you safe from man-in-the-middle attacks which could allow the attacker to take over your SSH account/credentials.
See also: Ansible Docs - Hostkey Verification
As this security feature is important, you SHOULD NOT DISABLE IT IN PRODUCTION by adding the environment variable ANSIBLE_HOST_KEY_CHECKING=False to your jobs!
In production you might want to either:
- Maintain a list of known-good hostkeys
  You can specify which known_hosts file AW should use, using the config setting AW_SSH_KNOWN_HOSTS
- Implement CA-signed hostkeys
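A list of known-good hostkeys can be kept in a dedicated file that AW is then pointed at. The sketch below uses the standard OpenSSH tools `ssh-keyscan` and `ssh-keygen`; the helper name `aw_collect_hostkeys`, the file path and the host names are assumptions made for illustration. Keys fetched over the network are not trustworthy by themselves, so compare the printed fingerprints against a trusted source before relying on the file.

```shell
# Hypothetical helper: append the hashed ed25519 hostkeys of the given hosts
# to a known_hosts file and print their fingerprints for manual verification.
# Usage: aw_collect_hostkeys <KNOWN-HOSTS-FILE> <HOST> [<HOST> ...]
aw_collect_hostkeys() {
  if [ "$#" -lt 2 ]; then
    echo "usage: aw_collect_hostkeys <KNOWN-HOSTS-FILE> <HOST> [<HOST> ...]" >&2
    return 2
  fi
  known_hosts="$1"
  shift
  # -H hashes the host names, -t limits the requested key type
  ssh-keyscan -H -t ed25519 "$@" >> "$known_hosts" || return 1
  # show the fingerprints so they can be compared with a trusted source
  ssh-keygen -l -f "$known_hosts"
}

# Example (path and hosts are placeholders):
#   aw_collect_hostkeys /etc/ansible-webui/known_hosts srv1.example.com srv2.example.com
# Afterwards, point the AW_SSH_KNOWN_HOSTS setting at that file.
```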
Python Module not installed¶
Error: While executing Ansible you see: No module named '<MODULE>'
Problem:
Your Ansible controller system is missing a required Python3 module!
If you are NOT using Docker, you can install it manually using PIP:
python3 -m pip install <MODULE>
You could also find and install the module using your system's package manager:
sudo apt install python3-<MODULE>
(NOTE: these packages are older versions)
If you are using Docker, you can create and mount a requirements.txt and restart your container. See also: Usage - Docker
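Before and after installing, it can help to check whether the module is actually visible to the interpreter. The following is a minimal sketch; the helper name `aw_check_module` is made up for this example, and it assumes that `python3` on your PATH is the same interpreter (or virtual environment) that runs AW and Ansible.

```shell
# Hypothetical helper: succeed if python3 can locate the given module.
# Usage: aw_check_module <MODULE>
aw_check_module() {
  python3 -c 'import importlib.util, sys; sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1" 2>/dev/null
}

# Example: replace 'json' with the module name from the error message
if aw_check_module json; then
  echo "module found"
else
  echo "module missing"
fi
```

If the check succeeds in your shell but the job still fails, the job is probably running with a different interpreter or environment than the one you tested.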
CSRF Failed¶
Error: After submitting a form you see: Forbidden (403) CSRF verification failed. Request aborted.
Problem:
The hostname you are using to access AW is probably not configured as/listed in AW_HOSTNAMES
SAML Issues¶
To get more information - you can enable its logging by adding this block to the config file:
...
SAML:
  LOGGING:
    version: 1
    formatters:
      simple:
        format: '[%(asctime)s] [%(levelname)s] [%(name)s.%(funcName)s] %(message)s'
    handlers:
      stdout:
        class: 'logging.StreamHandler'
        stream: 'ext://sys.stdout'
        level: 'DEBUG'
        formatter: 'simple'
    loggers:
      saml2:
        level: 'DEBUG'
    root:
      level: 'DEBUG'
      handlers: ['stdout']
Note: The SAML config-file is only reloaded on restart.
Common errors you might encounter:
- CSRF validation failed - the ACS url may not be configured correctly
- If you see a page with an error code, you can look up their references here
  For example:
  - 1107 means you supplied an invalid SAML configuration or the xmlsec package is not installed
  - 1110 means you might need to check your IDP's metadata and modify the NAME_ID_FORMAT setting
  - 1113 and 1114 mean you have not configured or have misconfigured your attribute mappings
Note: SAML testing has been done using the mocksaml.com service
Edge-Case Issues¶
Connection in use¶
Error: While starting AW you see: Connection in use: ('127.0.0.1', 8000)
Problem:
- Make sure no other process is binding to port 8000:
  netstat -tulpn | grep 8000
  If that is the case, you can set the AW_PORT env-var to change the port to be used.
- The app failed last time. There is still an old process running. If this happens repeatedly - open an issue!
  You can find and kill it:
  # find it
  pgrep -f ansibleguy-webui
  netstat -tulpn | grep 8000
  ps -aux | grep ansibleguy-webui | grep -v grep
  # kill it
  pkill -f ansibleguy-webui
  kill -9 <PID>
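If `netstat` is not available, the port can also be probed directly. This is a sketch only: the helper name `aw_port_in_use` is hypothetical, and it relies on the `/dev/tcp` pseudo-device, which is a bash feature and does not work in plain `sh`.

```shell
# Requires bash: /dev/tcp/<HOST>/<PORT> is handled by bash itself, it is not a real file.
# Hypothetical helper: succeed if something accepts TCP connections on host:port.
# Usage: aw_port_in_use <HOST> <PORT>
aw_port_in_use() {
  # the connection is opened in a subshell and closed again when the subshell exits
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if aw_port_in_use 127.0.0.1 8000; then
  echo "port 8000 is in use"
else
  echo "port 8000 looks free"
fi
```

The check only tells you that something is listening, not which process it is; use the commands above to identify the process.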
Database is locked¶
Error: The Web interface shows a plain Error 500 and the console shows django.db.utils.OperationalError: database is locked
Problem:
I’ve encountered this issue a few times. It occurs because the SQLite database is locked by a write-operation.
Restarting the application is the easiest way of working around it.
If it occurs more often - please open an issue!
If you are running many jobs - you could try to keep a minute between their scheduled executions.
Too Many Log Files exist¶
Error: Job logs are currently not cleaned automatically. You may want to clean them manually periodically.
Resolution:
You can easily remove all log-files older than N days with this command:
MAX_LOG_AGE=7 # days
cd ~/.local/share/ansible-webui/
find . -type f -mtime "+${MAX_LOG_AGE}" -name "*.log" -delete
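Because `-delete` is irreversible, it can be useful to review the matches first. The sketch below wraps the same `find` call in a hypothetical helper, `aw_clean_logs`, that only lists files unless `delete` is passed explicitly. Note that `-mtime +N` matches files whose age, counted in whole days, is greater than N.

```shell
# Hypothetical helper: list (default) or delete job logs older than N days.
# Usage: aw_clean_logs <LOG-DIR> <MAX-AGE-DAYS> [delete]
aw_clean_logs() {
  log_dir="$1"
  max_age="$2"
  action="${3:-list}"
  if [ ! -d "$log_dir" ]; then
    echo "no such directory: $log_dir" >&2
    return 1
  fi
  if [ "$action" = "delete" ]; then
    find "$log_dir" -type f -name '*.log' -mtime "+${max_age}" -delete
  else
    find "$log_dir" -type f -name '*.log' -mtime "+${max_age}" -print
  fi
}

# Example: review first, then delete
#   aw_clean_logs ~/.local/share/ansible-webui 7
#   aw_clean_logs ~/.local/share/ansible-webui 7 delete
```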
Database Migration Issues¶
Note: This is a general guide on how to handle Django migration issues. It could also be helpful if you are running another Django app.
Error: After a version upgrade you see django.db.utils.OperationalError: no such column or even django.db.utils.OperationalError: no such table
Problem:
It seems the database schema was not upgraded. This is normally done automatically at application startup.
You can try to execute the migrations manually:
Stop the application
Enter the application context & try to upgrade
# when running as local service-user
su <SERVICE-USER> --login --shell /bin/bash
# when running in docker
docker exec -it ansible-webui /bin/sh
# set the path to your database
export AW_DB=<PATH-TO-YOUR-DB>
# upgrade DB schema
python3 -m ansibleguy-webui.manage migrate
Error: While running the database schema upgrade you see django.db.utils.OperationalError: duplicate column name or django.db.utils.OperationalError: duplicate table name
Problem:
This should never happen if you are running a release version (AW_ENV=prod) and did not already run migrations manually.
Make sure you set the AW_DB env-var correctly before running the migrations.
You will have to find out which migrations were already applied:
python3 -m ansibleguy-webui.manage showmigrations
Or check your database manually:
sqlite3 <PATH-TO-YOUR-DB>
SELECT name,applied FROM django_migrations WHERE app = 'aw';
You can also check the current schema of the table you see mentioned in the error message
sqlite3 <PATH-TO-YOUR-DB>
.table
.schema <TABLE>
Check which migrations are available:
python3 -m ansibleguy-webui.cli -a migrations.list
With that information you should be able to determine which migrations you can fake and which ones to apply.
# migrations that are available and already are applied to the database - can be faked (only last one)
python3 -m ansibleguy-webui.manage migrate --fake aw 0001_v0_0_12
# you should then be able to apply the un-applied migrations
python3 -m ansibleguy-webui.manage migrate aw 0002_v0_0_13
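To see at a glance which migrations Django still considers un-applied, the output of showmigrations can be filtered. Django marks applied migrations with `[X]` and un-applied ones with `[ ]`. The helper name `aw_unapplied_migrations` is hypothetical, and the sample input only mirrors the migration names used above.

```shell
# Hypothetical helper: read 'showmigrations' output on stdin and print the
# names of the migrations that are not yet applied (marked with '[ ]').
aw_unapplied_migrations() {
  sed -n 's/^ *\[ ] *//p'
}

# Example with sample output; against a real setup you would run:
#   python3 -m ansibleguy-webui.manage showmigrations aw | aw_unapplied_migrations
printf 'aw\n [X] 0001_v0_0_12\n [ ] 0002_v0_0_13\n' | aw_unapplied_migrations
```

Keep in mind that this reflects what is recorded in the django_migrations table. If the schema was changed without that table being updated, the list will not match the real state, which is the situation the `--fake` step is meant to repair.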
Database Startup Issue¶
Error: While starting AW - you see the error sqlite3.DatabaseError: database disk image is malformed
Problem:
The service may have been force-terminated without being able to close the database connection gracefully.
You can try to move away (or remove) the aw.db-shm and aw.db-wal files that can be found in the same directory as your database file.
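Moving the files aside is the safer variant, since the -wal file can still contain changes that were not yet written into the main database file. A minimal sketch, assuming the application is stopped; the helper name `aw_move_db_sidecars` and the backup directory naming are made up for this example.

```shell
# Hypothetical helper: move the SQLite side files (<DB>-shm, <DB>-wal) into a
# timestamped backup directory next to the database and print that directory.
# Usage: aw_move_db_sidecars <PATH-TO-YOUR-DB>
aw_move_db_sidecars() {
  db="$1"
  backup_dir="${db}.bak.$(date +%Y%m%d%H%M%S)"
  mkdir -p "$backup_dir" || return 1
  for suffix in shm wal; do
    if [ -e "${db}-${suffix}" ]; then
      mv "${db}-${suffix}" "$backup_dir/" || return 1
    fi
  done
  echo "$backup_dir"
}

# Example (stop AW first):
#   aw_move_db_sidecars <PATH-TO-YOUR-DB>
```

If the application starts cleanly afterwards and no data is missing, the backup directory can be deleted.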