#3359 ntf: fix ntfd coredump

Milestone: 5.26.02

Status: fixed

Owner: Khuong Ba Le

Labels: None

Type: defect

Component: ntf

Part: d

Version:

Priority: minor

Blocker: False

Updated: 2026-02-27

Created: 2024-09-05

Creator: Khuong Ba Le

Private: No

ntfd dumps core when it tries to access a null pointer.

Backtrace:
#0 0x0000558e5506a000 in NtfClient::getMdsDest() const ()
#1 0x0000558e55072213 in NtfAdmin::processNotification(unsigned int, SaNtfNotificationTypeT, ntfsv_send_not_req*, mds_sync_snd_ctxt*, unsigned long long) ()
#2 0x0000558e55072487 in NtfAdmin::notificationReceived(unsigned int, SaNtfNotificationTypeT, ntfsv_send_not_req*, mds_sync_snd_ctxt*) ()
#3 0x0000558e5505805f in ?? ()
#4 0x0000558e55058844 in ?? ()
#5 0x0000558e55058c2b in ntfs_process_mbx ()
#6 0x0000558e55056637 in main ()

Steps to reproduce:
1. Simulate an NFS hang so that the ntf buffer fills up.
2. Have the clients send many notifications to the server.
3. Reboot all clients to simulate clients going down while notifications are being sent.
* Recommendation: use a cluster with multiple payload nodes so that numerous clients send notifications simultaneously. This increases the likelihood of overload and therefore the chance of hitting the core dump.

Description has changed:

Diff:

--- old
+++ new
@@ -1,4 +1,4 @@
-Coredump happen when race condition between remove client thread and get client thread in ntfd
+Coredump happens when try to access to a null pointer in ntfd.
 ~~~
 BT ###
 #0 0x0000558e5506a000 in NtfClient::getMdsDest() const ()
@@ -10,6 +10,7 @@
 #6 0x0000558e55056637 in main ()
 ~~~
 Step Reproduce:
-Step 1: send many notification from remote note to node active
-Step 2: toggling SIGSTOP/SIGCONT osaflogd process on node active, ntf's buffer will be full.
-Step 3: reboot remote node.
+    1. Simulate nfs hang to make the ntf buffer full.
+    2. Client send many notifications to server.
+    3. Reboot all client to simulate client down while sending ntf.
+  * Recommend: using a cluster with multiple node payloads to simulate numerous clients sending notifications simultaneously. This will increase the likelihood of overload, thereby raising the chances of encountering a core dump.

Gary Lee - 2025-04-26

Milestone: 5.25.03 --> 5.25.04

Gary Lee - 2025-04-26

Milestone: 5.25.04 --> future

Khuong Ba Le - 2026-01-07

status: assigned --> review

Milestone: future --> 5.25.09

Thien Minh Huynh - 2026-01-07

status: review --> fixed

Thien Minh Huynh - 2026-01-07

commit eb24877a84905aa5d72b9551764c271c8b3bd24e (HEAD -> develop, origin/develop)
Author: khuonglb <khuong.b.le@endava.com>
Date: Wed Oct 1 15:51:31 2025 +0700

ntfd: prevent coredump on null client during notification [#3359]

ntfd could crash when a notification was processed after its client had been removed, leaving a null client pointer and causing a null pointer dereference.

Check for a null client before handling notifications and return early.
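
The guard described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the actual patch: `Client`, `Admin`, `MdsDest`, `lastDest()` and `replayTicketScenario()` are hypothetical stand-ins, and only the method names `getMdsDest` and `processNotification` are borrowed from the backtrace. The sketch assumes the client lookup returns a null pointer once the client has been removed, which is what the commit message suggests.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for the MDS destination type used by ntfd.
using MdsDest = std::uint64_t;

// Hypothetical, simplified client; the real NtfClient holds far more state.
class Client {
 public:
  explicit Client(MdsDest dest) : dest_(dest) {}
  MdsDest getMdsDest() const { return dest_; }

 private:
  MdsDest dest_;
};

// Hypothetical, simplified admin object that owns the registered clients.
class Admin {
 public:
  void clientAdded(unsigned int clientId, MdsDest dest) {
    clients_[clientId] = std::make_unique<Client>(dest);
  }

  void clientRemoved(unsigned int clientId) { clients_.erase(clientId); }

  // Returns nullptr when the client is no longer registered.
  Client* getClient(unsigned int clientId) const {
    auto it = clients_.find(clientId);
    return it == clients_.end() ? nullptr : it->second.get();
  }

  // Returns true if the notification was handled, false if it was dropped
  // because the sending client has already gone away.
  bool processNotification(unsigned int clientId) {
    Client* client = getClient(clientId);
    if (client == nullptr) {
      return false;  // return early instead of dereferencing a null pointer
    }
    lastDest_ = client->getMdsDest();
    return true;
  }

  MdsDest lastDest() const { return lastDest_; }

 private:
  std::map<unsigned int, std::unique_ptr<Client>> clients_;
  MdsDest lastDest_ = 0;
};

// Replays the sequence from the ticket in miniature: a notification is
// processed after its client was removed.
inline void replayTicketScenario() {
  Admin admin;
  admin.clientAdded(1, 0xABCD);
  assert(admin.processNotification(1));   // client present: handled
  admin.clientRemoved(1);                 // client goes down
  assert(!admin.processNotification(1));  // late notification: dropped
}
```

Calling `replayTicketScenario()` walks through the add, notify, remove, notify sequence. Without the null check, the second `processNotification` call would dereference a null pointer, which matches the crash in `getMdsDest()` shown in the backtrace.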
