Bug #658
Ensure LDAP is started before services using it
100%
Description
For example, PHP-FPM is started too early on Toushirou. We need to list all the affected services.
As this is specific to our use of LDAP accounts for certain services, I think a DC-specific service file distributed by the dc-ldap role could ensure that all affected services wait until slapd is up (on LDAP servers only).
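As a rough illustration of the "wait until slapd is up" idea, the sketch below polls the passwd database until an LDAP-backed account resolves. It is not the fix that was deployed; the function name `wait_for_user`, the timeout values and the `lookup` parameter (there only so the loop can be exercised without LDAP) are assumptions made for the example.

```python
import pwd
import time


def wait_for_user(name, timeout=30.0, interval=1.0, lookup=pwd.getpwnam):
    """Poll the passwd database until `name` resolves or `timeout` expires.

    Returns True as soon as the lookup succeeds, False once the deadline
    has passed. pwd.getpwnam raises KeyError for an unknown account, which
    is the situation behind "cannot get uid for user" in the logs.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            lookup(name)
            return True
        except KeyError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A small wrapper script around this function could, for instance, be run from an `ExecStartPre=` line in a drop-in for an affected unit, so that the service only starts once its account can be looked up.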
History
Updated by Marc Dequènes 7 months ago
PHP-FPM needs to wait for LDAP:
Aug 18 05:55:50 Toushirou php-fpm7.3[2175]: [18-Aug-2020 05:55:50] ERROR: [pool albums.georgesleyeti.fr] cannot get uid for user 'georgesleyeti'
Aug 18 05:55:50 Toushirou php-fpm7.3[2175]: [18-Aug-2020 05:55:50] ERROR: FPM initialization failed
Aug 18 05:55:50 Toushirou systemd[1]: php7.3-fpm.service: Main process exited, code=exited, status=78/CONFIG
Aug 18 05:55:50 Toushirou systemd[1]: php7.3-fpm.service: Failed with result 'exit-code'.
Updated by Marc Dequènes 7 months ago
One more:
Aug 18 14:21:45 Elwing systemd[5160]: duck-calibre-server.service: Failed to determine user credentials: No such process
Aug 18 14:21:45 Elwing systemd[5160]: duck-calibre-server.service: Failed at step USER spawning /usr/bin/calibre-server: No such process
Aug 18 14:21:45 Elwing systemd[1]: duck-calibre-server.service: Main process exited, code=exited, status=217/USER
Aug 18 14:21:45 Elwing systemd[1]: duck-calibre-server.service: Failed with result 'exit-code'.
Updated by Marc Dequènes 7 months ago
- Status changed from New to Resolved
- Assignee set to Marc Dequènes
- % Done changed from 0 to 100
There was a fix for duck-calibre-server.service but that was not sufficient; it should be ok now.
I also added a fix in the httpd_php_fpm role.
Since we rebooted all machines today, that should be all.
Updated by Pierre-Louis Bonicoli 25 days ago
This issue occurred yesterday on Toushirou:
1. slapd package has been updated by unattended-upgrade (from /var/log/unattended-upgrades/unattended-upgrades-dpkg.log)
Log started: 2021-02-04 06:18:23
apt-listchanges: Reading changelogs...
Preconfiguring packages ...
[...]
Unpacking slapd (2.4.47+dfsg-3+deb10u5) over (2.4.47+dfsg-3+deb10u4) ...
[...]
Setting up slapd (2.4.47+dfsg-3+deb10u5) ...
Backing up /etc/ldap/slapd.d in /var/backups/slapd-2.4.47+dfsg-3+deb10u4... done.
Processing triggers for systemd (241-7~deb10u5) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for libc-bin (2.28-10) ...
[...]
Restarting services...
systemctl restart apache2.service clamav-freshclam.service dovecot.service matrix-synapse.service nslcd.service php7.3-fpm.service postfix@-.service proftpd.service tt-rss.service zabbix-agent.service
Job for php7.3-fpm.service failed because the control process exited with error code.
See "systemctl status php7.3-fpm.service" and "journalctl -xe" for details.
[...]
2. php-fpm status
journalctl -u php7.3-fpm.service
-- Logs begin at Tue 2021-02-02 18:55:52 CET, end at Thu 2021-02-04 15:27:15 CET. --
Feb 04 06:18:36 Toushirou systemd[1]: Stopping The PHP 7.3 FastCGI Process Manager...
Feb 04 06:18:36 Toushirou systemd[1]: php7.3-fpm.service: Succeeded.
Feb 04 06:18:36 Toushirou systemd[1]: Stopped The PHP 7.3 FastCGI Process Manager.
Feb 04 06:18:36 Toushirou systemd[1]: php7.3-fpm.service: Consumed 13h 12min 20.075s CPU time.
Feb 04 06:18:36 Toushirou systemd[1]: Starting The PHP 7.3 FastCGI Process Manager...
Feb 04 06:18:37 Toushirou php-fpm7.3[12259]: [04-Feb-2021 06:18:37] ERROR: [pool albums.georgesleyeti.fr] cannot get uid for user 'georgesleyeti'
Feb 04 06:18:37 Toushirou php-fpm7.3[12259]: [04-Feb-2021 06:18:37] ERROR: FPM initialization failed
Feb 04 06:18:37 Toushirou systemd[1]: php7.3-fpm.service: Main process exited, code=exited, status=78/CONFIG
Feb 04 06:18:37 Toushirou systemd[1]: php7.3-fpm.service: Failed with result 'exit-code'.
Feb 04 06:18:37 Toushirou systemd[1]: Failed to start The PHP 7.3 FastCGI Process Manager.
Feb 04 06:18:37 Toushirou systemd[1]: php7.3-fpm.service: Consumed 62ms CPU time.
3. slapd logs
journalctl -u slapd.service
-- Logs begin at Tue 2021-02-02 18:55:52 CET, end at Thu 2021-02-04 15:55:26 CET. --
Feb 04 06:18:30 Toushirou systemd[1]: Stopping LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol)...
Feb 04 06:18:30 Toushirou slapd[841]: daemon: shutdown requested and initiated.
Feb 04 06:18:30 Toushirou slapd[841]: slapd shutdown: waiting for 2 operations/tasks to finish
Feb 04 06:18:30 Toushirou slapd[841]: DIGEST-MD5 common mech free
Feb 04 06:18:30 Toushirou slapd[841]: DIGEST-MD5 common mech free
Feb 04 06:18:30 Toushirou slapd[841]: slapd stopped.
Feb 04 06:18:30 Toushirou slapd[11005]: Stopping OpenLDAP: slapd.
Feb 04 06:18:30 Toushirou systemd[1]: slapd.service: Succeeded.
Feb 04 06:18:30 Toushirou systemd[1]: Stopped LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol).
Feb 04 06:18:30 Toushirou systemd[1]: slapd.service: Consumed 4h 34min 19.908s CPU time.
Feb 04 06:18:30 Toushirou systemd[1]: Starting LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol)...
Feb 04 06:18:30 Toushirou slapd[11016]: @(#) $OpenLDAP: slapd (Jan 22 2021 03:54:40) $ Debian OpenLDAP Maintainers <pkg-openldap-devel@lists.alioth.debian.org>
Feb 04 06:18:30 Toushirou slapd[11017]: slapd starting
Feb 04 06:18:30 Toushirou slapd[11011]: Starting OpenLDAP: slapd.
Feb 04 06:18:30 Toushirou systemd[1]: Started LSB: OpenLDAP standalone server (Lightweight Directory Access Protocol).
Feb 04 06:18:30 Toushirou slapd[11017]: do_syncrep2: rid=004 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Feb 04 06:18:30 Toushirou slapd[11017]: do_syncrep2: rid=002 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Feb 04 06:31:28 Toushirou slapd[11017]: do_syncrep2: rid=002 (-1) Can't contact LDAP server
Feb 04 06:31:28 Toushirou slapd[11017]: do_syncrep2: rid=004 (-1) Can't contact LDAP server
Feb 04 06:31:28 Toushirou slapd[11017]: do_syncrepl: rid=002 rc -1 retrying (2 retries left)
Feb 04 06:31:28 Toushirou slapd[11017]: do_syncrepl: rid=004 rc -1 retrying (2 retries left)
Feb 04 06:31:38 Toushirou slapd[11017]: do_syncrep2: rid=002 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Feb 04 06:31:38 Toushirou slapd[11017]: do_syncrep2: rid=004 LDAP_RES_INTERMEDIATE - REFRESH_DELETE
Marc Dequènes, shouldn't this issue be reopened?
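To compare the two journals, the gap between systemd reporting slapd as started and php-fpm exiting can be computed from the timestamps. This is only a small sketch: the two lines are copied from the journal excerpts in this comment, the year is taken from the "Logs begin" header, and the helper names `journal_time` and `gap_seconds` are made up for the example.

```python
from datetime import datetime

# Lines copied verbatim from the slapd and php-fpm journals quoted in this comment.
SLAPD_STARTED = (
    "Feb 04 06:18:30 Toushirou systemd[1]: Started LSB: OpenLDAP standalone "
    "server (Lightweight Directory Access Protocol)."
)
FPM_FAILED = (
    "Feb 04 06:18:37 Toushirou systemd[1]: php7.3-fpm.service: "
    "Main process exited, code=exited, status=78/CONFIG"
)


def journal_time(line, year=2021):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a journalctl line.

    journalctl's short format omits the year, so it is passed in explicitly.
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def gap_seconds(earlier, later):
    """Return the number of seconds between two journal lines."""
    return (journal_time(later) - journal_time(earlier)).total_seconds()
```

For these two lines, `gap_seconds(SLAPD_STARTED, FPM_FAILED)` gives 7.0, i.e. php-fpm failed seven seconds after slapd was reported as started.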