[Linux-gpib-general] patch proposal


Dear Michael and Dear David,

1)
here is a proposed patch for the system hang that occurs when a device is
off-line or nonexistent. Thanks to Michael for having spotted the problem.

When bb_write() is called, NDAC must already be asserted (low); otherwise
no device is listening. This condition was not checked, and that made
bb_NRFD_interrupt() go crazy.
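
As an illustration of the check described above, here is a minimal
user-space sketch. It is not the driver code: struct bus_lines and
sketch_write() are hypothetical names, and the line values stand in for
what gpiod_get_value() reports in the driver (1 = line high, 0 = line low).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two handshake lines as the controller reads
 * them: 1 = line high (released), 0 = line low (asserted). */
struct bus_lines {
    int nrfd;
    int ndac;
};

/* Returns 0 on success, -1 when no listener is holding NDAC low. */
static int sketch_write(const struct bus_lines *bus, const unsigned char *buf,
                        size_t length, size_t *bytes_written)
{
    assert(bus != NULL && buf != NULL && bytes_written != NULL);

    *bytes_written = 0;

    /* NDAC high before the first byte means nobody is listening, so
     * return before the interrupt handlers are armed. */
    if (bus->ndac)
        return -1;

    /* The real driver would now hand the transfer to the interrupt
     * handlers; this model simply reports the whole buffer as sent. */
    *bytes_written = length;
    return 0;
}
```

With ndac = 1 (no listener) the call returns -1 and reports zero bytes
written; with ndac = 0 it proceeds.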

In the patch, I have also moved the (normally useless) warnings for
out-of-order or idle interrupts to debug level 1. But see also point 2.

--- gpib_bitbang.c-76c3dc    2024-07-27 23:35:01.412798005 +0200
+++ gpib_bitbang.c    2024-07-27 23:48:26.639818311 +0200
@@ -417,7 +417,7 @@
               int send_eoi, size_t *bytes_written)
  {
      unsigned long flags;
-        int retval = 0;
+        int retval = -1;

          bb_private_t *priv = board->private_data;

@@ -438,6 +438,7 @@

          dbg_printk(1,"Enabling interrupts - NRFD: %d   NDAC: %d\n",
                          gpiod_get_value(NRFD), gpiod_get_value(NDAC));
+        if (gpiod_get_value(NDAC)) goto write_end;

          spin_lock_irqsave (&priv->rw_lock, flags);
                  priv->w_busy = 1;          /* make the interrupt routines active */
@@ -506,13 +507,13 @@
          if (priv->phase == 99)     ENABLE_IRQ (priv->irq_NRFD, IRQ_TYPE_EDGE_RISING);

          if (priv->w_busy == 0) {
-                dbg_printk(0,"interrupt while idle after %zu/%zu at %d\n",
+                dbg_printk(1,"interrupt while idle after %zu/%zu at %d\n",
                                        priv->w_cnt, priv->length, priv->phase);
          priv->nrfd_idle++;
                  goto nrfd_exit;  /* idle */
          }
          if (nrfd == 0) {
-                dbg_printk(0,"out of order interrupt after %zu/%zu at %d cmd %d " LINFMT ".\n",
+                dbg_printk(1,"out of order interrupt after %zu/%zu at %d cmd %d " LINFMT ".\n",
                                        priv->w_cnt, priv->length, priv->phase, priv->cmd, LINVAL);
                  priv->phase = 3;
          priv->nrfd_seq++;
@@ -565,12 +566,12 @@
          irq, gpiod_get_value(NRFD), ndac, board->status, priv->direction, priv->w_busy, priv->r_busy);

          if (priv->w_busy == 0) {
-                dbg_printk(0,"interrupt while idle.\n");
+                dbg_printk(1,"interrupt while idle.\n");
          priv->ndac_idle++;
                  goto ndac_exit;
          }
          if (ndac == 0) {
-                dbg_printk(0,"out of order interrupt at %zu:%d.\n", priv->w_cnt, priv->phase);
+                dbg_printk(1,"out of order interrupt at %zu:%d.\n", priv->w_cnt, priv->phase);
                  priv->phase = 5;
          priv->ndac_seq++;
                  goto ndac_exit;



2)
I see the problem of a buffer overflow while writing, but to reproduce
repeated interrupts with slow edges I had to use rise and fall times
longer than 100 us, which is uncommon, and it happened only on the RPi3b.
RPi4 and RPi5 have Schmitt triggers on their inputs, which makes the event
impossible. Still, a good interlock between the NDAC and NRFD interrupts is
advisable, and the current implementation (for historical reasons) is a
mess. Sorry for that, and thanks again to Michael for pointing out the
problem. A revision of this part of the code was already scheduled. Asap,
as usual.
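
To make the idea of an interlock concrete, here is a minimal user-space
sketch. It is not the planned revision and not the driver code: struct
writer, enum wait_for, writer_start(), on_nrfd() and on_ndac() are
hypothetical names, and data_lines stands in for set_data_lines(). Each
handler acts only when it is its turn, and the NRFD handler never indexes
past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states: which handshake edge the writer expects next. */
enum wait_for { WAIT_NRFD, WAIT_NDAC, WRITE_DONE };

struct writer {
    const unsigned char *w_buf;
    size_t length;
    size_t w_cnt;              /* bytes put on the bus so far */
    enum wait_for expect;
    unsigned char data_lines;  /* stands in for set_data_lines() */
    unsigned spurious;         /* edges ignored by the interlock */
};

static void writer_start(struct writer *w, const unsigned char *buf,
                         size_t length)
{
    assert(w != NULL && buf != NULL);
    w->w_buf = buf;
    w->length = length;
    w->w_cnt = 0;
    w->expect = length ? WAIT_NRFD : WRITE_DONE;
    w->data_lines = 0;
    w->spurious = 0;
}

/* NRFD edge: listeners are ready, so put out the next byte, but only if
 * it is NRFD's turn and there is still data left to send. */
static void on_nrfd(struct writer *w)
{
    if (w->expect != WAIT_NRFD || w->w_cnt >= w->length) {
        w->spurious++;
        return;
    }
    w->data_lines = w->w_buf[w->w_cnt++];
    w->expect = WAIT_NDAC;     /* lock out NRFD until NDAC has fired */
}

/* NDAC edge: the byte was accepted, so re-arm NRFD or finish. */
static void on_ndac(struct writer *w)
{
    if (w->expect != WAIT_NDAC) {
        w->spurious++;
        return;
    }
    w->expect = (w->w_cnt >= w->length) ? WRITE_DONE : WAIT_NRFD;
}
```

In this model, repeated NRFD edges for one byte are counted as spurious
instead of advancing w_cnt, so neither the wrong-data case nor the
unbounded increment can occur.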

Bye

Marcello Carla'

On 7/28/24 21:17, Michael Schwingen wrote:
> On 26.07.24 17:48, marcello.carla via Linux-gpib-general wrote:
>> Dear Michael,
>>
>> bb_DAV_interrupt():
>>
>> the check for buffer overflow is at line 390 of current version
>> [76c3dc] (line 379 of last unpatched [b4cbd1]):
>>
>> priv->end_flag = ((priv->count >= priv->request) || priv->end);
>>
>> 'count' is the number of characters read; 'request' is the buffer
>> length; when the buffer is full, the operation is terminated even
>> before an EOI or newline. Can you reproduce the conditions under
>> which this mechanism does not work correctly?
>
> No, I currently can't reproduce it, but I am quite sure I had cases
> where the count incremented way beyond the expected transfer count.
>
> It might have been the write case:
>
> in bb_NRFD_interrupt, we have
>
>         set_data_lines(priv->w_buf[priv->w_cnt++]); // put the data on the lines
> with no check of the transfer size - it checks the size to assert EOI,
> but does not stop further interrupts from happening.
>
> The check to end the transfer is in bb_NDAC_interrupt.
>
> Now if you get lots of NRFD interrupts before NDAC interrupt is
> called, you will increment w_cnt without limit.
>
> Also, if you get multiple NRFD interrupts for one transfer (which can
> happen with sloppy rising/falling edges and reflections), you will get
> wrong data in the buffer. The NRFD interrupt should be locked out
> until the NDAC phase has happened (and vice-versa).
>
>
>>
>> system hang:
>>
>> yes, there is a problem; when you address a nonexistent device
>> with ibrd(), you correctly obtain a timeout error; when you try
>> an ibwrt() on a nonexistent device, the system hangs. I shall
>> try to spot the error and propose a remedy asap.
>
> The interesting thing is this only happens some of the time. I had to
> power-cycle the DMM about 5 times before I could catch the hang.
>
> cu
>
> Michael
>
>
>
> _______________________________________________
> Linux-gpib-general mailing list
> Lin...@li...
> https://lists.sourceforge.net/lists/listinfo/linux-gpib-general




