workQPanic: Kernel work queue overflow / netTask
Andreas F  
Newsgroups: comp.os.vxworks
From: glenn_is_c...@hotmail.com (Andreas F)
Date: 6 Oct 2004 03:46:04 -0700
Local: Wed, Oct 6 2004 6:46 am
Subject: workQPanic: Kernel work queue overflow / netTask
Hi,

In our system we get the "workQPanic: Kernel work queue overflow",
though very rarely. This is followed by a reboot.

I understand this is because the processor can't handle all the
interrupts. What I am wondering is if the netTask has anything to do
with it? I read in a letter from 1993 that the netTask is responsible
for clearing the work queue. I wonder if this still is the case, cause
I can't find anything about it in  the documentation.

Unfortunately, our system has tasks running with higher priorities
than netTask, but according to the documentation this should only
affect the documentation.


joe durusau  
Newsgroups: comp.os.vxworks
From: joe durusau <joe.duru...@lmco.com>
Date: Wed, 06 Oct 2004 07:45:58 -0400
Local: Wed, Oct 6 2004 7:45 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask

Andreas F wrote:
> Hi,

> In our system we get the "workQPanic: Kernel work queue overflow",
> though very rarely. This is followed by a reboot.

> I understand this is because the processor can't handle all the
> interrupts. What I am wondering is if the netTask has anything to do
> with it? I read in a letter from 1993 that the netTask is responsible
> for clearing the work queue. I wonder if this still is the case, cause
> I can't find anything about it in  the documentation.

> Unfortunately, our system has tasks running with higher priorities
> than netTask, but according to the documentation this should only
> affect the documentation.

    You will crash the system if packets are arriving faster than netTask
can handle them.  One possible cause is tasks with priority higher than
netTask that are consuming too much time.  With WindView or some such,
you can monitor the system idle time.  Most folks would suggest that
under normal conditions, a minimum of 50% idle time should exist.

Speaking only for myself,

Joe Durusau


Andreas F  
Newsgroups: comp.os.vxworks
From: glenn_is_c...@hotmail.com (Andreas F)
Date: 6 Oct 2004 08:03:29 -0700
Local: Wed, Oct 6 2004 11:03 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask

glenn_is_c...@hotmail.com (Andreas F) wrote in message <news:ccd61736.0410060246.50bbbdf2@posting.google.com>...
> Hi,

> In our system we get the "workQPanic: Kernel work queue overflow",
> though very rarely. This is followed by a reboot.

> I understand this is because the processor can't handle all the
> interrupts. What I am wondering is if the netTask has anything to do
> with it? I read in a letter from 1993 that the netTask is responsible
> for clearing the work queue. I wonder if this still is the case, cause
> I can't find anything about it in  the documentation.

> Unfortunately, our system has tasks running with higher priorities
> than netTask, but according to the documentation this should only
> affect the documentation.

Oops! The last line should be "this should only affect the debugging
capabilities." ;)

Also, if it is not the netTask that services the kernel work queue,
which task does?


John  
Newsgroups: comp.os.vxworks
From: john_94...@yahoo.com (John)
Date: 6 Oct 2004 22:38:06 -0700
Local: Thurs, Oct 7 2004 1:38 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask
Hello,

> I understand this is because the processor can't handle all the
> interrupts. What I am wondering is if the netTask has anything to do
> with it? I read in a letter from 1993 that the netTask is responsible
> for clearing the work queue. I wonder if this still is the case, cause
> I can't find anything about it in  the documentation.

Well, you have been misinformed since the netTask has never been
responsible for processing the kernel's work queue (having a network
stack is not even a requirement for VxWorks). The work queue contains
kernel operations, such as a semGive or a msgQSend, that occurred in
the ISR associated with an interrupt that happened while the system
was in kernel state.

OK, that was a complicated sentence, so here's an example:

   Task 1 calls semGive().

   semGive() enters what is called "kernel state" - a special protected
   state that prevents corruption of kernel data, but does not require
   interrupts to be disabled.

   An interrupt occurs.

   The ISR tries to give a semaphore to release a task later on.

   Since the system is already in kernel state, that semGive is added
   to the work queue.

   The ISR exits, returning control to the task level semGive (no
   rescheduling can happen here since we are in kernel state).

   semGive completes and calls windExit to leave kernel state.

   windExit processes any jobs that are pending in the work queue, and
   then either returns to the current task's code, or invokes the
   scheduler (based on whether the head of the ready queue has changed
   as a result of either of the semaphore operations).

So, in a way any task may process the work queue, although really it
is windExit that does so. It will always happen in the context of a
task though (whichever one entered kernel state just before the
interrupt arrived).
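
As an illustration of the sequence above, here is a minimal, portable C
model of the deferral mechanism. It is only a sketch and not VxWorks
source: every model_* name is hypothetical, the depth of 64 is assumed
from a figure mentioned later in this thread, semaphores are reduced to
counters, and interrupts are simulated by plain function calls. A real
system panics and reboots on overflow; the model only sets a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_WORK_Q_DEPTH 64  /* assumed depth, not taken from kernel source */
#define MODEL_NUM_SEMS     4   /* hypothetical number of semaphores */

typedef struct {
    bool   kernel_state;                /* true while a kernel operation runs */
    bool   panicked;                    /* set when a deferred job does not fit */
    int    sem_count[MODEL_NUM_SEMS];   /* stand-in for real semaphores */
    int    work_q[MODEL_WORK_Q_DEPTH];  /* ids of semaphores to give later */
    size_t q_len;                       /* number of deferred jobs */
} model_kernel;

/* The kernel operation itself; must only run inside kernel state. */
static void model_sem_give_core(model_kernel *k, int sem)
{
    assert(sem >= 0 && sem < MODEL_NUM_SEMS);
    k->sem_count[sem]++;
}

/* Leave kernel state, first running every job that ISRs deferred. */
static void model_wind_exit(model_kernel *k)
{
    for (size_t i = 0; i < k->q_len; i++)
        model_sem_give_core(k, k->work_q[i]);
    k->q_len = 0;
    k->kernel_state = false;
}

/* semGive as issued from an ISR. */
static void model_isr_sem_give(model_kernel *k, int sem)
{
    if (!k->kernel_state) {             /* interrupted ordinary task code */
        k->kernel_state = true;
        model_sem_give_core(k, sem);
        model_wind_exit(k);
        return;
    }
    if (k->q_len == MODEL_WORK_Q_DEPTH) {  /* fixed depth exhausted */
        k->panicked = true;                /* real system: panic and reboot */
        return;
    }
    k->work_q[k->q_len++] = sem;        /* defer until kernel state is left */
}

/* semGive from a task, with n_interrupts simulated ISRs arriving
 * while the task is still inside kernel state. */
static void model_task_sem_give(model_kernel *k, int sem,
                                int isr_sem, int n_interrupts)
{
    k->kernel_state = true;
    for (int i = 0; i < n_interrupts; i++)
        model_isr_sem_give(k, isr_sem);
    model_sem_give_core(k, sem);
    model_wind_exit(k);
}
```

Calling model_task_sem_give() on a zero-initialised model_kernel with
three simulated interrupts leaves the queue empty and both semaphores
updated; asking for more interrupts than MODEL_WORK_Q_DEPTH sets the
panicked flag, which corresponds to the overflow discussed here.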

This design, like any, has pros and cons. The pros are much reduced
interrupt latency since there are few places that interrupts are
blocked. One of the cons is that the queue is a fixed depth, and if it
fills up before it can be processed then the system panics and
reboots.

Often these panic reboots are caused by a buggy interrupt handler (one
that does not exit for example, but keeps spinning in an attempt to
handle as many events as possible for a device). High speed network
devices are particularly vulnerable to this since they tend to loop
through the incoming slots in the hardware ring looking for the end -
if the device is filling the slots faster than the CPU processes them
(which can happen for high speed network devices on slower processor
systems), then this ISR might never exit, resulting in a work queue
overflow. A ping flood from a remote machine while a task is running
semGive in a tight loop is often a good way to check for these types
of problem.
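
One common mitigation for the never-exiting ISR described above is to
cap the work done per invocation. This is not something stated in this
thread, only a sketch under assumptions: the ring is modelled as a
counter, the device is assumed to refill a fixed number of slots per
slot handled, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t pending;        /* slots currently filled by the device */
    size_t refill_per_rx;  /* slots the device fills while one is handled */
} model_rx_ring;

/* Handle at most `budget` slots, then return so the ISR can exit.
 * Returns the number of slots handled in this invocation. */
static size_t model_isr_drain(model_rx_ring *ring, size_t budget)
{
    assert(ring != NULL);
    size_t handled = 0;
    while (ring->pending > 0 && handled < budget) {
        ring->pending--;                       /* consume one slot */
        ring->pending += ring->refill_per_rx;  /* device keeps filling */
        handled++;
    }
    return handled;
}
```

With refill_per_rx set to 1 the ring never empties, so a loop that only
tests for an empty ring would spin forever; the budget forces an exit
after a known amount of work, at the cost of leaving slots for the next
interrupt.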

HTH,
John...

=====
Contribute to the VxWorks Cookbook at: http://books.bluedonkey.org/


Jeff C r e e.m  
Newsgroups: comp.os.vxworks
From: "Jeff C r e e.m" <jcr...@yahoo.com>
Date: Thu, 07 Oct 2004 11:21:20 GMT
Local: Thurs, Oct 7 2004 7:21 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask

"John" <john_94...@yahoo.com> wrote in message

news:488e459a.0410062138.725e6e25@posting.google.com...

> Hello,

>> I understand this is because the processor can't handle all the
>> interrupts. What I am wondering is if the netTask has anything to do
>> with it? I read in a letter from 1993 that the netTask is responsible
>> for clearing the work queue. I wonder if this still is the case, cause
>> I can't find anything about it in  the documentation.

> Well, you have been misinformed since the netTask has never been
> responsible for processing the kernel's work queue (having a network
> stack is not even a requirement for VxWorks). The work queue contains
> kernel operations, such as a semGive or a msgQSend, that occurred in
> the ISR associated with an interrupt that happened while the system
> was in kernel state.

I suspect (part of) this misconception comes from the (somewhat) common
usage of netJobAdd by programmers to occasionally kick off something
short-lived to run at the task level while in an ISR.

I have seen a few instances where someone thought it was a good idea to
use netJobAdd to run all non-network sporadic task-level processing, and
it ended up filling up the mailbox that queues these requests.


Andreas F  
Newsgroups: comp.os.vxworks
From: glenn_is_c...@hotmail.com (Andreas F)
Date: 8 Oct 2004 02:54:54 -0700
Local: Fri, Oct 8 2004 5:54 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask
Thanks, your answers have all been very useful.

/Andreas


Ignacio G.T.  
Newsgroups: comp.os.vxworks
From: igtorque.rem...@evomer.yahoo.es (Ignacio G.T.)
Date: Fri, 08 Oct 2004 14:42:13 GMT
Local: Fri, Oct 8 2004 10:42 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask
On 6 Oct 2004 22:38:06 -0700, john_94...@yahoo.com (John) wrote:

>Hello,

[x]

>Often these panic reboots are caused by a buggy interrupt handler (one
>that does not exit for example, but keeps spinning in an attempt to
>handle as many events as possible for a device). High speed network
>devices are particularly vulnerable to this since they tend to loop
>through the incoming slots in the hardware ring looking for the end -

You are right. We experienced such a problem when we used our device in F.O.
redundancy rings. There is a network protocol (spanning tree protocol) that
routers can use to logically break those rings; if it didn't exist, a
message could travel around the ring many times until its eventual extinction.

Unfortunately, this breaking of the loop is not immediate: it takes from one to
several seconds to break a loop. When we tested the case of a physically broken
(opened) ring being closed, we found that a single broadcast message sent by a
device during the first second after the re-closure could appear at all the
devices hundreds of thousands of times per second!

This flood made all our devices crash (with workQPanic). And yes, we eventually
found that the problem lay in a buggy network driver.

--
Ignacio G.T.


dpstrand  
Newsgroups: comp.os.vxworks
From: dpstr...@gmail.com (dpstrand)
Date: 13 Oct 2004 19:29:46 -0700
Local: Wed, Oct 13 2004 10:29 pm
Subject: Re: workQPanic: Kernel work queue overflow / netTask

We have/had a similar problem here on an embedded i960 chip that uses
its PCI bus messaging unit as its network device for token ring
network packets. The other chipset that places messages in this poor
little i960's messaging unit can place messages faster than they can
be serviced, and in this situation the ISR that services the messaging
unit will continuously get called resulting in this workQ panic.

I know the real solution would be to throttle the host unit so that it
does not place messages so quickly, but does anyone know of a clean way
to protect against this scenario in the ISR of the slower unit? I have
experimented with looking at the size of the workQ and breaking out of
the ISR when approaching the limit of 64, and it seems to prevent
the crash, but packets will begin to experience latency during times
of burst traffic as they won't get serviced until the next messages
are placed.
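
A rough model of the experiment described above, as a portable C sketch
rather than the actual driver code. The post does not say how the workQ
size was read, so the model simply keeps its own counter; the safety
margin of 8, the assumption that each serviced message defers exactly
one kernel job, and all model_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_WORK_Q_LIMIT  64  /* the limit quoted in the post */
#define MODEL_SAFETY_MARGIN 8   /* hypothetical headroom, to be tuned */

typedef struct {
    size_t work_q_used;   /* jobs currently deferred in the work queue */
    size_t msgs_waiting;  /* messages the host has posted */
} model_msg_unit;

/* Service messages until none remain or the work queue gets close to
 * its limit. Returns the number of messages serviced. */
static size_t model_msg_isr(model_msg_unit *u)
{
    assert(u != NULL);
    size_t serviced = 0;
    while (u->msgs_waiting > 0 &&
           u->work_q_used + MODEL_SAFETY_MARGIN < MODEL_WORK_Q_LIMIT) {
        u->msgs_waiting--;
        u->work_q_used++;   /* assumed: one deferred job per message */
        serviced++;
    }
    return serviced;
}
```

Starting from an empty work queue with 100 messages waiting, one call
services 56 of them and leaves 44 behind. Those 44 wait until the ISR
runs again, which is the latency during bursts mentioned above.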

David Strand


Friedrich Ensslin  
Newsgroups: comp.os.vxworks
From: Friedrich Ensslin <fr...@ask.me>
Date: Thu, 21 Oct 2004 22:42:46 +0200
Local: Thurs, Oct 21 2004 4:42 pm
Subject: Re: workQPanic: Kernel work queue overflow / netTask
Hi,

according to my observations, it seems to be the excTask that is
responsible for executing those jobs queued by ISRs.

Fritz


John  
Newsgroups: comp.os.vxworks
From: john_94...@yahoo.com (John)
Date: 22 Oct 2004 01:25:29 -0700
Local: Fri, Oct 22 2004 4:25 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask
Hello,

> according to my observations, it seems to be the excTask that is
> responsible for executing those jobs queued by ISRs.

Your observations are incorrect. The exception task handles a few
clean up operations for the OS (the end of taskDelete() processing
when a task is deleting itself for example), and some work relating to
the display of h/w exception messages.

The kernel's work queue, as I stated in an earlier post, is processed
by the scheduling code. There are actually several places where the
queue is checked and any contents processed, but all are in the
scheduling code. That essentially means that they are not running in
any task, though they will be using the stack of the interrupted task
in reality for 5.x (in AE things are a little more complex - getting
this right in a multi-address space environment took a lot of care!).

Finally, there is the network task's ring buffer, which is what
netJobAdd() drops things in. This is used by network driver interrupt
routines to defer processing to task level, and also by network
related watchdogs, such as phy/link monitoring tasks.
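
To make the distinction from the kernel work queue concrete, here is a
small portable C model of a job ring that is filled at interrupt level
and drained by a task. It is a sketch only: the ring size of 8, the
single-argument job signature, and all model_* names are hypothetical
and are not the netJobAdd() interface itself.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_JOB_RING_SIZE 8  /* hypothetical; real size not given here */

typedef void (*model_job_fn)(int arg);

typedef struct {
    model_job_fn fn;
    int          arg;
} model_job;

typedef struct {
    model_job slots[MODEL_JOB_RING_SIZE];
    size_t    head;   /* index of the next job to run */
    size_t    count;  /* jobs currently waiting */
} model_job_ring;

/* Interrupt level: queue a job for the task, or return -1 if full. */
static int model_job_add(model_job_ring *r, model_job_fn fn, int arg)
{
    assert(fn != NULL);
    if (r->count == MODEL_JOB_RING_SIZE)
        return -1;
    size_t tail = (r->head + r->count) % MODEL_JOB_RING_SIZE;
    r->slots[tail].fn  = fn;
    r->slots[tail].arg = arg;
    r->count++;
    return 0;
}

/* Task level: run every queued job; returns how many were run. */
static size_t model_job_task_run(model_job_ring *r)
{
    size_t ran = 0;
    while (r->count > 0) {
        model_job job = r->slots[r->head];
        r->head = (r->head + 1) % MODEL_JOB_RING_SIZE;
        r->count--;
        job.fn(job.arg);
        ran++;
    }
    return ran;
}

/* Example job: accumulate received byte counts. */
static int model_rx_total;
static void model_rx_handler(int nbytes) { model_rx_total += nbytes; }
```

If the servicing task is starved, or the ring is used for unrelated
sporadic work, model_job_add() starts returning -1. That is a different
overflow from workQPanic, even though both involve a fixed-size queue.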

HTH,
John...

=====
Contribute to the VxWorks Cookbook at: http://books.bluedonkey.org/


s.subbarayan  
Newsgroups: comp.os.vxworks
From: s_subbara...@rediffmail.com (s.subbarayan)
Date: 26 Oct 2004 21:17:53 -0700
Local: Wed, Oct 27 2004 12:17 am
Subject: Re: workQPanic: Kernel work queue overflow / netTask
John,
   Your reply states: "The kernel's work queue, as I stated in an
earlier post, is processed
by the scheduling code. There are actually several places where the
queue is checked and any contents processed, but all are in the
scheduling code. That essentially means that they are not running in
any task, though they will be using the stack of the interrupted task
in reality for 5.x (in AE things are a little more complex - getting
this right in a multi-address space environment took a lot of care!)."
    Out of curiosity: what are the several places where the queue is
checked, and how frequently does that happen? What would be the optimum
frequency for this check? IMHO there are pros and cons to the checking
frequency. Checking too often is inefficient in terms of processing
time (and worse still when no rescheduling is required at the moment of
the check), while checking too rarely makes the scheduler miss the
scheduling of tasks, which is much worse than checking too frequently.
How does Wind River handle this situation?
I would be happy if you could give some pointers on this, or point me
to some links where I can learn about it. Although I ask mainly in
order to learn, a clear understanding would also help me design my
VxWorks applications properly.
Thanks in advance for your replies,
Regards,
s.subbarayan

