Discussion:
Custom IRQ priorities
James Harris
2021-04-17 10:56:24 UTC
It's not possible to implement custom IRQ priorities on a traditional
PC, right?

That's what I've thought for many years but I might have finally found a
way to do it!

Here's the idea. Feel free to comment.

The principle is to get the PICs to inform us immediately of any IRQs
which are requesting service so we gain a complete picture of all the
IRQs which require attention.

Then, since we will know of all IRQs which need attention we can process
them in whatever order we want.

The key change from normal is for an interrupt to be EOId before it is
handled rather than afterwards, and for the IMR to be used to prevent
that same interrupt number from firing again.

But as well as masking an interrupt it would be added to a waiting list.
The waiting list could be as simple as a bit array with one bit for each
IRQ number or it could be more extensive but the point is that it would
keep track of which IRQs had been signalled but had not yet been processed.

To illustrate, here's a piece of pseudocode to show what would happen
when an interrupt comes in. All that would have happened before it
starts is that the IRQ number would have been determined and EOI would
have been issued automatically (the AEOI setting).

On an interrupt:

Push a minimal set of registers
Mask this interrupt in the relevant IMR
Add this interrupt to the set of waiting interrupts
...
Pop the registers pushed earlier
iret

Aside from saving and restoring registers all that would be done would
be to mask off the current interrupt and to leave a note that the
interrupt is awaiting service.
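
A minimal C sketch of that short path, under stated assumptions: the two IMRs are simulated as plain variables rather than accessed through I/O ports 0x21 and 0xA1, the waiting list is a 16-bit bitmap, and every identifier is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated state. On real hardware the IMRs sit behind I/O ports
   0x21 (master) and 0xA1 (slave). All names here are hypothetical. */
static uint8_t  imr_master;   /* bit n set = IRQ n masked         */
static uint8_t  imr_slave;    /* bit n set = IRQ n+8 masked       */
static uint16_t waiting;      /* bit n set = IRQ n awaits service */

/* Short path taken on every IRQ: mask it, then note it as waiting. */
void irq_arrived(unsigned irq)
{
    assert(irq < 16);
    if (irq < 8)
        imr_master |= (uint8_t)(1u << irq);
    else
        imr_slave |= (uint8_t)(1u << (irq - 8));
    waiting |= (uint16_t)(1u << irq);
}
```

Real entry code would do the same with an OR on the IMR port value and on a memory bitmap, between the register push and pop.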

If half a dozen interrupts all fire at once then each of the six would,
in turn, follow the above code path. That would leave all six masked off
and recorded as awaiting service.

But, naturally, something has to do the actual servicing. It could be
one of the interrupts or a high-priority task.

In the code below I'll make it one of the interrupts. I'll use a nesting
level to determine whether an interrupt is already being serviced or
not. Nesting level will be zero on initial entry and will be 1 if an
interrupt is already being serviced. There would only be the two levels.

The code in the above which is shown as "..." would be

If nesting level is zero
Increment nesting level
Push more registers, as required
Loop while there are any interrupts waiting
Pick the IRQ /we/ want to treat as of highest priority
Delete it from waiting list
Enable interrupts
Handle the IRQ
Disable interrupts
Unmask the IRQ
Endloop
Pop the registers pushed earlier in this fragment
Decrement nesting level
Endif

The loop would take interrupts in any order we wanted, thus implementing
custom prioritisation, and it would unmask each one as it completed it.

If any new interrupts fired while the code was running they would be
added to the waiting list and then would be processed in their turn
before the loop completed.

Once the waiting list was empty the loop would terminate. The code would
then, as normal, iret back to whatever had been interrupted.
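
The whole scheme, short path plus loop, can be simulated in C. This is only a sketch under assumptions: the IMRs are one 16-bit variable, the priority order is an arbitrary example, and `inject_when`/`inject_irq` are hypothetical hooks that make a new IRQ "fire" while a handler runs, standing in for interrupts being enabled at that point.

```c
#include <assert.h>
#include <stdint.h>

#define NIRQ 16

static uint16_t masked;    /* simulated IMR bits of both PICs, 1 = masked */
static uint16_t waiting;   /* IRQs signalled but not yet handled          */
static int      nesting;   /* 0 = idle, 1 = dispatch loop is running      */

/* Our own priority order, highest first (an arbitrary example). */
static const unsigned order[NIRQ] =
    { 12, 1, 14, 15, 4, 3, 0, 2, 5, 6, 7, 8, 9, 10, 11, 13 };

static unsigned handled[32];   /* log of the order IRQs were serviced in */
static unsigned nhandled;

static int      inject_when = -1;  /* hypothetical hook: while handling   */
static unsigned inject_irq;        /* this IRQ, make inject_irq fire once */

void irq_entry(unsigned irq);

/* Stand-in for a device handler; runs with interrupts "enabled". */
void handle_irq(unsigned irq)
{
    if (nhandled < 32)
        handled[nhandled++] = irq;
    if ((int)irq == inject_when) {
        inject_when = -1;
        irq_entry(inject_irq);     /* nested IRQ takes the short path */
    }
}

void irq_entry(unsigned irq)
{
    assert(irq < NIRQ);
    masked  |= (uint16_t)(1u << irq);    /* mask this IRQ          */
    waiting |= (uint16_t)(1u << irq);    /* add to the waiting set */

    if (nesting == 0) {
        nesting++;
        while (waiting != 0) {
            unsigned i = 0, cur;
            while (!(waiting & (1u << order[i])))
                i++;                            /* pick our top priority */
            cur = order[i];
            waiting &= (uint16_t)~(1u << cur);  /* delete from waiting   */
            handle_irq(cur);                    /* STI ... CLI around it */
            masked &= (uint16_t)~(1u << cur);   /* unmask when done      */
        }
        nesting--;
    }
}
```

With IRQ 4 already waiting when IRQ 3 arrives, and IRQ 12 firing while IRQ 3 is being handled, this sketch services them in the order 4, 3, 12.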

I think that's it. What do you think? Would it work? Can it be improved?
--
James Harris
Rod Pemberton
2021-04-17 15:20:33 UTC
On Sat, 17 Apr 2021 11:56:24 +0100
Post by James Harris
It's not possible to implement custom IRQ priorities on a traditional
PC, right?
That's what I've thought for many years but I might have finally
found a way to do it!
Here's the idea. Feel free to comment.
The principle is to get the PICs to inform us immediately of any IRQs
which are requesting service so we gain a complete picture of all the
IRQs which require attention.
Then, since we will know of all IRQs which need attention we can
process them in whatever order we want.
The key change from normal is for an interrupt to be EOId before it
is handled rather than afterwards, and for the IMR to be used to
prevent that same interrupt number from firing again.
So, you're splitting the typical interrupt routine into two parts? ...

I.e., you block the interrupt with IMR, add to a list of interrupts to
process, and then simply return from the initial IRQ or INT. Later,
from your interrupt scheduler, you call the interrupt handling routine
to actually process the interrupt based upon your preferred interrupt
preferences, clearing IMR afterward, and/or clearing EOI if it was a
hardware interrupt.
Post by James Harris
But as well as masking an interrupt it would be added to a waiting
list. The waiting list could be as simple as a bit array with one bit
for each IRQ number or it could be more extensive but the point is
that it would keep track of which IRQs had been signalled but had not
yet been processed.
Are your main interrupt handling routines re-entrant or being masked off
with CLI/STI?
Post by James Harris
To illustrate, here's a piece of pseudocode to show what would happen
when an interrupt comes in. All that would have happened before it
starts is that the IRQ number would have been determined and EOI
would have been issued automatically (the AEOI setting).
Push a minimal set of registers
Mask this interrupt in the relevant IMR
Add this interrupt to the set of waiting interrupts
...
Pop the registers pushed earlier
iret
If it's possible to mask the IMR without destroying registers, then
you'd want to move the push of registers to the third step, so as to
mask IMR off as quickly as possible. E.g., use values in memory and
constants - instead of stack or registers - to set IMR mask and set the
list of waiting interrupts.


However, I don't think that is what you're wanting. I'm not entirely
sure what you're doing yet, but I'd think that what you're attempting to
do is this:

Reprogram PICs so that software INTs and hardware IRQs are on different
interrupts. Otherwise, you must use the PIC's In-Service Register (ISR)
register to separate them.

On initial interrupt: (either INT or IRQ)
/* IRQ or INT number is known. IMR mask is known. */
/* CLI/STI wrapper may be needed for PM ... */
Non-destructively, mask this interrupt in the relevant IMR
Non-destructively, add this interrupt to the set of waiting interrupts
IRET

Interrupt handler routine: (... to be called later)
CLI
PUSHA
... (Handle interrupt)
Clear IMR for interrupt
For hardware interrupts, clear the *specific* EOI for the interrupt
POPA
STI
RET

Scheduler:
Loop {
Call interrupt handler routines based on new priority
}


"Non-destructively" would be a MOV or OR against memory location with
hardcoded INT/IRQ number or hardcoded IMR mask. E.g., you'll have many
generic initial interrupt routines, each of which would have the
hardcoded IRQ/INT number used with MOV or OR, as they're being called
directly from your IDT/IVT table, or specifically from interrupt gates
for PM.

Most OSes do a generic EOI clear, as EOIs are being handled in-order,
in-time. So, I strongly suspect that the *specific* EOI likely must
be cleared, since EOIs are now being handled out-of-order (as I did for
my OS). I.e., you could possibly clear the wrong EOI with a generic EOI
clear. (Yes, I'm not entirely sure about that bit on EOIs ... Somewhat
unsure actually. So, don't quote me. Keep it in mind and verify it
yourself.)
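
For reference, the two kinds of EOI differ only in the OCW2 byte written to the PIC command port. The sketch below computes the bytes for a specific EOI; the command values (0x20 non-specific, 0x60 plus level for specific) and the cascade on master IR2 are standard 8259A/PC facts, while the struct and function names are invented for illustration and no port I/O is actually performed.

```c
#include <assert.h>
#include <stdint.h>

#define PIC_EOI_NONSPECIFIC 0x20u  /* OCW2: clears highest-priority ISR bit */
#define PIC_EOI_SPECIFIC    0x60u  /* OCW2: clears the level in bits 0..2   */
#define PIC_CASCADE_IR      2u     /* slave PIC hangs off master input IR2  */

struct eoi_cmds {
    int     to_slave;    /* nonzero: write slave_cmd to port 0xA0 first */
    uint8_t slave_cmd;
    uint8_t master_cmd;  /* always written to port 0x20 */
};

/* Compute the command bytes for a specific EOI of IRQ 0..15. */
struct eoi_cmds specific_eoi(unsigned irq)
{
    struct eoi_cmds c = { 0, 0, 0 };
    assert(irq < 16);
    if (irq >= 8) {
        c.to_slave   = 1;
        c.slave_cmd  = (uint8_t)(PIC_EOI_SPECIFIC | (irq - 8));
        c.master_cmd = (uint8_t)(PIC_EOI_SPECIFIC | PIC_CASCADE_IR);
    } else {
        c.master_cmd = (uint8_t)(PIC_EOI_SPECIFIC | irq);
    }
    return c;
}
```

For IRQ14 this yields 0x66 for the slave and 0x62 for the master. Whether the specific form is really required depends on how many ISR bits can be set when the EOI is sent, which is the point in doubt above.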
Post by James Harris
Aside from saving and restoring registers all that would be done
would be to mask off the current interrupt and to leave a note that
the interrupt is awaiting service.
If half a dozen interrupts all fire at once then each of the six
would, in turn, follow the above code path. That would leave all six
masked off and recorded as awaiting service.
But, naturally, something has to do the actual servicing. It could be
one of the interrupts or a high-priority task.
In the code below I'll make it one of the interrupts. I'll use a
nesting level to determine whether an interrupt is already being
serviced or not. Nesting level will be zero on initial entry and will
be 1 if an interrupt is already being serviced. There would only be
the two levels.
The code in the above which is shown as "..." would be
Sorry, I made an assumption that you were splitting your routine into
two, but instead, it seems you've rolled it into one.
Post by James Harris
If nesting level is zero
Increment nesting level
Push more registers, as required
Loop while there are any interrupts waiting
Pick the IRQ /we/ want to treat as of highest priority
Delete it from waiting list
Enable interrupts
Handle the IRQ
Disable interrupts
Unmask the IRQ
Endloop
Pop the registers pushed earlier in this fragment
Decrement nesting level
Endif
Is this and the earlier portion intended to be re-entrant or should the
entire thing be disabled entirely from interrupts? I.e., it seems you
pushed CLI/STI down to just prior to calling the IRQ routine.


As I didn't recall, I was trying to review as to when *exactly* the IF
flag is cleared (IF=0) causing interrupts to be blocked, and this is
what I have so far:

a) CLI instruction in RM
b) CLI instruction in PM, if CPL<=IOPL, otherwise GP fault
c) CLI instruction in V86, if IOPL=3, otherwise GP fault
d) hardware IRQ via INTR pin
e) PM interrupt gate, via a CALL instruction to the gate
f) INT instruction in RM (i.e., RM software INT)
g) INT instruction redirected to a PM interrupt gate for V86 mode
h) INT instruction due to a privilege level change in PM

(Wikipedia is missing almost all of that except CLI ...)

Apparently, at least, according to the x86 INT instruction "Operation"
description that I'm reading online, the IF flag is *NOT* cleared for
a normal software INT instruction when in PM. I.e., PM software INT
won't clear the IF flag, but a RM software INT will clear the IF flag.

WTF??? ...

If that's correct, as I haven't checked against the pdf manuals
..., that would imply that maybe the CLI/STI pair needs to be moved to
the start/end of the entire merged routine for PM code, as the IF flag
wouldn't be cleared automatically for PM software INTs, unlike for RM
INTs. Hardware IRQs will clear IF flag for any mode. Also note that
CLI only works in PM if the CPL and IOPL are set properly.


It also appears that you've embedded the round-robin scheduler within
the "..." section. Since you're actually PUSHing/POPing all registers
once the "..." section is merged into the earlier section, I'd just use
PUSHA/POPA around the entire thing (faster), and I would also attempt
to mask IMR non-destructively (with MOV or OR or AND) as quick as
possible prior to other code.
Post by James Harris
The loop would take interrupts in any order we wanted, thus
implementing custom prioritisation, and it would unmask each one as
it completed it.
If any new interrupts fired while the code was running they would be
added to the waiting list and then would be processed in their turn
before the loop completed.
Once the waiting list was empty the loop would terminate. The code
would then, as normal, iret back to whatever had been interrupted.
...
Post by James Harris
I think that's it. What do you think? Would it work? Can it be
improved?
See the two-part split method above which I initially assumed you were
using.

--
wolfgang kern
2021-04-17 19:52:27 UTC
On 17.04.2021 17:20, Rod Pemberton wrote:
...
Post by Rod Pemberton
Reprogram PICs so that software INTs and hardware IRQs are on different
interrupts. Otherwise, you must use the PIC's In-Service Register (ISR)
register to separate them.
...
mmh... software INT on PIC ??? :)
IRQ-pins are only connected to HW. But you sure knew that.
__
wolfgang
Rod Pemberton
2021-04-18 17:33:38 UTC
On Sat, 17 Apr 2021 21:52:27 +0200
Post by wolfgang kern
Post by Rod Pemberton
Reprogram PICs so that software INTs and hardware IRQs are on
different interrupts. Otherwise, you must use the PIC's In-Service
Register (ISR) register to separate them.
...
mmh... software INT on PIC ??? :)
Are you asking, "Who would call IRQ14 with an INT 76h instruction?"
Post by wolfgang kern
mmh... software INT on PIC ??? :)
If they aren't on the same interrupts, then why reprogram the PICs? ...
Post by wolfgang kern
IRQ-pins are only connected to HW. But you sure knew that.
Yes, but by default, a single interrupt routine can service hardware,
software, or processor exceptions, faults, or traps.

The PICs are usually reprogrammed to avoid overloading the interrupt.
But I know that you know that.

--
wolfgang kern
2021-04-19 13:27:50 UTC
Post by Rod Pemberton
On Sat, 17 Apr 2021 21:52:27 +0200
Post by wolfgang kern
Post by Rod Pemberton
Reprogram PICs so that software INTs and hardware IRQs are on
different interrupts. Otherwise, you must use the PIC's In-Service
Register (ISR) register to separate them.
...
mmh... software INT on PIC ??? :)
Are you asking, "Who would call IRQ14 with an INT 76h instruction?"
Post by wolfgang kern
mmh... software INT on PIC ??? :)
If they aren't on the same interrupts, then why reprogram the PICs? ...
OK, just wrong order of words :) software INTs never touch the PIC.
Post by Rod Pemberton
Post by wolfgang kern
IRQ-pins are only connected to HW. But you sure knew that.
Yes, but by default, a single interrupt routine can service hardware,
software, or processor exceptions, faults, or traps.
Yeah, but I see such merging as a really bad concept, and there is no
need at all to let hardware-assigned INT numbers be abused by software.
Post by Rod Pemberton
The PICs are usually reprogrammed to avoid overloading the interrupt.
But I know that you know that.
Yes, I'd add "to avoid conflicts".
In the end I went the APIC way to avoid shared IRQs, but some
PCI devices won't/can't use anything other than IRQ_11.
__
wolfgang
James Harris
2021-04-18 10:36:46 UTC
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
...
Post by Rod Pemberton
So, you're splitting the typical interrupt routine into two parts? ...
Effectively, yes.
Post by Rod Pemberton
I.e., you block the interrupt with IMR, add to a list of interrupts to
process, and then simply return from the initial IRQ or INT.
Exactly, but only if an IRQ is already being handled.

If, by contrast, no IRQ is currently being handled then the code also
enters a loop to keep processing IRQs until all have been handled.
Post by Rod Pemberton
Later,
from your interrupt scheduler,
Not from a separate scheduler but from within the loop (which runs as
part of normal IRQ handling).

In a separate reply I'll put some real code which should hopefully make
the proposal much clearer.
Post by Rod Pemberton
you call the interrupt handling routine
to actually process the interrupt based upon your preferred interrupt
preferences, clearing IMR afterward,
Yes.
Post by Rod Pemberton
and/or clearing EOI if it was a
hardware interrupt.
This is only about hardware interrupts, no others.

In the proposal EOI is issued at the beginning of the routine for every
IRQ which occurs. In fact, PICs which support automatic EOI could issue
the EOI even before we get control. An Intel 8259A datasheet says:

"In the AEOI mode the ISR bit is reset at the end of the third INTA pulse."

(I believe EOI just clears the relevant bit in the 8259's ISR.)
Post by Rod Pemberton
Post by James Harris
But as well as masking an interrupt it would be added to a waiting
list. The waiting list could be as simple as a bit array with one bit
for each IRQ number or it could be more extensive but the point is
that it would keep track of which IRQs had been signalled but had not
yet been processed.
Are your main interrupt handling routines re-entrant or being masked off
with CLI/STI?
The intention is to enter the individual interrupt handlers with
interrupts enabled. The handlers can disable interrupts over parts of
their code if they need to.

...
Post by Rod Pemberton
I'm not entirely
sure what you're doing yet, but I'd think that what you're attempting to
Reprogram PICs so that software INTs and hardware IRQs are on different
interrupts.
For sure. Take that as already done. This proposal is wholly about
hardware interrupts (which I try to remember to refer to as IRQs to
distinguish them from software interrupts).

...
Post by Rod Pemberton
"Non-destructively" would be a MOV or OR against memory location with
hardcoded INT/IRQ number or hardcoded IMR mask. E.g., you'll have many
generic initial interrupt routines, each of which would have the
hardcoded IRQ/INT number used with MOV or OR, as they're being called
directly from your IDT/IVT table, or specifically from interrupt gates
for PM.
That's an intriguing idea but I think I can make the code a bit more
generic while still keeping good performance. See the code I'm about to
post in a separate reply.
Post by Rod Pemberton
Most OSes do a generic EOI clear, as EOIs are being handled in-order,
in-time. So, I strongly suspect that the *specific* EOI likely must
be cleared, since EOIs are now being handled out-of-order (as I did for
my OS). I.e., you could possibly clear the wrong EOI with a generic EOI
clear. (Yes, I'm not entirely sure about that bit on EOIs ... Somewhat
unsure actually. So, don't quote me. Keep it in mind and verify it
yourself.)
A non-specific EOI ought to be enough because we always EOI the
/current/ IRQ.

In fact, where it's available I'd use Auto EOI to save time.

But note this from an Intel 8259A datasheet:

"The AEOI mode can only be used in a master 8259A
and not a slave. 8259As with a copyright date of
1985 or later will operate in the AEOI mode as a
master or a slave."

Until I find a way in software to determine whether an 8259 is of the
bug-fixed type (ahem) I may need to do a manual EOI for the slave PIC.
The EOI to the slave would go at the top of the routine shortly after
saving the base set of registers.

And I think it could be a non-specific EOI because it would be EOIing
the current IRQ.
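
A sketch of what the initialisation might look like with AEOI on the master and an optional manual-EOI slave. The ICW values and port numbers are the standard 8259A/PC ones; the vector bases passed in, the `outb_sim` stand-in (which records writes instead of touching hardware) and the other names are assumptions for illustration. Real code would use a true port write and may need short delays between writes on old hardware.

```c
#include <assert.h>
#include <stdint.h>

#define ICW1_INIT_ICW4 0x11u  /* begin init, cascade mode, ICW4 follows */
#define ICW4_8086      0x01u  /* 8086/88 mode                           */
#define ICW4_AEOI      0x02u  /* automatic EOI                          */

struct port_write { uint16_t port; uint8_t value; };
static struct port_write wlog[16];   /* record of simulated port writes */
static unsigned nlog;

/* Stand-in for a real outb(): records the write instead of doing it. */
void outb_sim(uint16_t port, uint8_t value)
{
    assert(nlog < 16);
    wlog[nlog].port  = port;
    wlog[nlog].value = value;
    nlog++;
}

/* slave_aeoi = 0 keeps the slave in normal EOI mode, for PICs where
   AEOI on a slave cannot be trusted. */
void pic_init(uint8_t master_base, uint8_t slave_base, int slave_aeoi)
{
    uint8_t icw4_aeoi = (uint8_t)(ICW4_8086 | ICW4_AEOI);

    outb_sim(0x20, ICW1_INIT_ICW4);   /* ICW1 to both command ports */
    outb_sim(0xA0, ICW1_INIT_ICW4);
    outb_sim(0x21, master_base);      /* ICW2: vector base          */
    outb_sim(0xA1, slave_base);
    outb_sim(0x21, 0x04);             /* ICW3: slave on master IR2  */
    outb_sim(0xA1, 0x02);             /* ICW3: slave cascade id 2   */
    outb_sim(0x21, icw4_aeoi);        /* ICW4: master uses AEOI     */
    outb_sim(0xA1, slave_aeoi ? icw4_aeoi : (uint8_t)ICW4_8086);
}
```

Calling `pic_init(0x20, 0x28, 0)` ends with 0x03 written to port 0x21 and 0x01 to port 0xA1, so only slave IRQs would then need the manual EOI at the top of the routine.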

...
Post by Rod Pemberton
Sorry, I made an assumption that you were splitting your routine into
two, but instead, it seems you've rolled it into one.
You are right that there are two execution paths for when an IRQ fires:

One does:

Mask the IRQ
Add it to the waiting set
iret

The other does

Mask the IRQ
Add it to the waiting set
Loop over all those waiting
iret

The only difference is the loop.

If a number of IRQs fire at the same time the first to be executed will
take the path which includes the loop. While the loop is running,
however, all further IRQs will take the shorter path which does not
include the loop.

...
Post by Rod Pemberton
Apparently, at least, according to the x86 INT instruction "Operation"
description that I'm reading online, the IF flag is *NOT* cleared for
a normal software INT instruction when in PM. I.e., PM software INT
won't clear the IF flag, but a RM software INT will clear the IF flag.
Not quite. Take a look at

https://css.csail.mit.edu/6.858/2012/readings/i386/INT.htm

If I read it right it says that in Protected Mode IF will be zeroed if
the interrupt vectors via an interrupt gate but that if the IDT contains
a trap gate instead then IF will not be zeroed.

In practice an OS can protect itself from unwanted software INTs by the
DPL.

IF software interrupt (* i.e. caused by INT n, INT 3, or INTO *)
THEN
IF gate descriptor DPL < CPL
THEN #GP(vector number * 8+2+EXT);
FI;
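
To make the two points concrete, here is a small C sketch of the access byte of a 32-bit IDT gate. The type codes (0xE interrupt gate, 0xF trap gate), the present bit and the DPL field position are the architectural ones; the helper names are invented, and the check only mirrors the quoted pseudocode rather than any particular OS.

```c
#include <assert.h>
#include <stdint.h>

#define GATE_PRESENT 0x80u
#define GATE_INT32   0x0Eu   /* 32-bit interrupt gate: IF cleared on entry */
#define GATE_TRAP32  0x0Fu   /* 32-bit trap gate: IF left unchanged        */

/* Build the access byte (byte 5 of the 8-byte gate descriptor). */
uint8_t gate_attr(unsigned type, unsigned dpl)
{
    assert(dpl < 4);
    assert(type == GATE_INT32 || type == GATE_TRAP32);
    return (uint8_t)(GATE_PRESENT | (dpl << 5) | type);
}

/* Does entry through a gate with this access byte clear IF? */
int gate_clears_if(uint8_t attr)
{
    return (attr & 0x0Fu) == GATE_INT32;
}

/* Mirrors the quoted check: a software INT gets #GP if gate DPL < CPL. */
int soft_int_allowed(uint8_t attr, unsigned cpl)
{
    unsigned dpl = (attr >> 5) & 3u;
    return dpl >= cpl;
}
```

An IRQ entry would typically use `gate_attr(GATE_INT32, 0)`, giving 0x8E: IF is cleared on entry and ring-3 code cannot reach the vector with an INT instruction.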

...
Post by Rod Pemberton
If that's correct, as I haven't checked against the pdf manuals
..., that would imply that maybe the CLI/STI pair needs to be moved to
the start/end of the entire merged routine for PM code, as the IF flag
wouldn't be cleared automatically for PM software INTs, unlike for RM
INTs. Hardware IRQs will clear IF flag for any mode. Also note that
CLI only works in PM if the CPL and IOPL are set properly.
I don't think I need them. AIUI the hardware begins the
interrupt-handling code with interrupts disabled.
Post by Rod Pemberton
It also appears that you've embedded the round-robin scheduler within
the "..." section. Since you're actually PUSHing/POPing all registers
once the "..." section is merged into the earlier section, I'd just use
PUSHA/POPA around the entire thing (faster), and I would also attempt
to mask IMR non-destructively (with MOV or OR or AND) as quick as
possible prior to other code.
As for pushing and popping, take a look at the code I am about to send
in a separate reply. It needs only four registers so it only saves four.
You are right that pusha and popa are tempting, though.
--
James Harris
Rod Pemberton
2021-04-18 17:08:35 UTC
On Sun, 18 Apr 2021 11:36:46 +0100
Post by James Harris
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
Post by James Harris
...
So, you're splitting the typical interrupt routine into two parts? ...
This is only about hardware interrupts, no others.
In the proposal EOI is issued at the beginning of the routine for
every IRQ which occurs. In fact, PICs which support automatic EOI
could issue the EOI even before we get control. An Intel 8259A
"In the AEOI mode the ISR bit is reset at the end of the third INTA pulse."
(I believe EOI just clears the relevant bit in the 8259's ISR.)
So, you're saying that IMR will block future hardware interrupts, so
that you can immediately clear EOI for the current hardware interrupt?
Ok.
Post by James Harris
Post by Rod Pemberton
Post by James Harris
But as well as masking an interrupt it would be added to a waiting
list. The waiting list could be as simple as a bit array with one
bit for each IRQ number or it could be more extensive but the
point is that it would keep track of which IRQs had been signalled
but had not yet been processed.
Are your main interrupt handling routines re-entrant or being
masked off with CLI/STI?
The intention is to enter the individual interrupt handlers with
interrupts enabled. The handlers can disable interrupts over parts of
their code if they need to.
...
Safe?
Post by James Harris
Post by Rod Pemberton
Most OSes do a generic EOI clear, as EOIs are being handled
in-order, in-time. So, I strongly suspect that the *specific*
EOI likely must be cleared, since EOIs are now being handled
out-of-order (as I did for my OS). I.e., you could possibly clear
the wrong EOI with a generic EOI clear. (Yes, I'm not entirely
sure about that bit on EOIs ... Somewhat unsure actually. So,
don't quote me. Keep it in mind and verify it yourself.)
A non-specific EOI ought to be enough because we always EOI the
/current/ IRQ.
So, you're saying that your non-specific EOI occurs quickly enough or
early enough in the code that out-of-order EOIs won't be an issue. Ok.
Post by James Harris
Post by Rod Pemberton
Sorry, I made an assumption that you were splitting your routine
into two, but instead, it seems you've rolled it into one.
Mask the IRQ
Add it to the waiting set
iret
The other does
Mask the IRQ
Add it to the waiting set
Loop over all those waiting
iret
The only difference is the loop.
If a number of IRQs fire at the same time the first to be executed
will take the path which includes the loop. While the loop is
running, however, all further IRQs will take the shorter path which
does not include the loop.
Ok. So, you're unrolling the loop.
Post by James Harris
Post by Rod Pemberton
Apparently, at least, according to the x86 INT instruction
"Operation" description that I'm reading online, the IF flag is
*NOT* cleared for a normal software INT instruction when in PM.
I.e., PM software INT won't clear the IF flag, but a RM software
INT will clear the IF flag.
Not quite. Take a look at
https://css.csail.mit.edu/6.858/2012/readings/i386/INT.htm
If I read it right it says that in Protected Mode IF will be zeroed
if the interrupt vectors via an interrupt gate but that if the IDT
contains a trap gate instead then IF will not be zeroed.
In the version you posted a link to, the PM interrupt falls through to
a trap gate or an interrupt gate.

The version which I was reading, has an extra line in the
PROTECTED-MODE section:

(* PE=1, DPL<CPL, software interrupt *)

It's a comment line. Unfortunately, I took the "software interrupt" as
a marker to indicate that the PM interrupt is called at that point.
They used a comment as a marker at another point in their pseudo-code:

...
(* Starts execution of new routine in Protected Mode *)
END;

http://qcd.phys.cmu.edu/QCDcluster/intel/vtune/reference/vc140.htm
Post by James Harris
Post by Rod Pemberton
If that's correct, as I haven't checked against the pdf manuals
..., that would imply that maybe the CLI/STI pair needs to be moved
to the start/end of the entire merged routine for PM code, as the
IF flag wouldn't be cleared automatically for PM software INTs,
unlike for RM INTs. Hardware IRQs will clear IF flag for any mode.
Also note that CLI only works in PM if the CPL and IOPL are set
properly.
I don't think I need them. AIUI the hardware begins the
interrupt-handling code with interrupts disabled.
I was attempting to determine all instances where IF can be cleared.
The complexity of the INT instruction flow doesn't make it clear. Even
after adjusting for some mistakes (below), it still appears to me that
IF isn't cleared for two INT instruction sections ... Maybe, I'll
attempt to reread it some other time.

The formatting of the page I was reading is messed up too. There is no
spacing between some sections. So, my list is incorrect. I don't
immediately see an IF clear in TASK-GATE or
INTERRUPT-FROM-VIRTUAL-8086-MODE, but they could be calling other
sections. There appears to be an IF clear in INTRA- and
INTER-PRIVILEGE-LEVEL-INTERRUPT. So, g) in my list is wrong. g) should
be for a task gate. Sigh, it's too hard to follow that code ... I.e.,
there may still be a possibility that IF isn't always clear, e.g., task
gate or v86. I'm probably still not reading that correctly.

--
wolfgang kern
2021-04-19 13:37:32 UTC
Post by Rod Pemberton
On Sun, 18 Apr 2021 11:36:46 +0100
Post by James Harris
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
Post by James Harris
...
So, you're splitting the typical interrupt routine into two parts? ...
This is only about hardware interrupts, no others.
In the proposal EOI is issued at the beginning of the routine for
every IRQ which occurs. In fact, PICs which support automatic EOI
could issue the EOI even before we get control. An Intel 8259A
"In the AEOI mode the ISR bit is reset at the end of the third INTA pulse."
(I believe EOI just clears the relevant bit in the 8259's ISR.)
So, you're saying that IMR will block future hardware interrupts, so
that you can immediately clear EOI for the current hardware interrupt?
Ok.
Post by James Harris
Post by Rod Pemberton
Post by James Harris
But as well as masking an interrupt it would be added to a waiting
list. The waiting list could be as simple as a bit array with one
bit for each IRQ number or it could be more extensive but the
point is that it would keep track of which IRQs had been signalled
but had not yet been processed.
Are your main interrupt handling routines re-entrant or being
masked off with CLI/STI?
The intention is to enter the individual interrupt handlers with
interrupts enabled. The handlers can disable interrupts over parts of
their code if they need to.
...
Safe?
Post by James Harris
Post by Rod Pemberton
Most OSes do a generic EOI clear, as EOIs are being handled
in-order, in-time. So, I strongly suspect that the *specific*
EOI likely must be cleared, since EOIs are now being handled
out-of-order (as I did for my OS). I.e., you could possibly clear
the wrong EOI with a generic EOI clear. (Yes, I'm not entirely
sure about that bit on EOIs ... Somewhat unsure actually. So,
don't quote me. Keep it in mind and verify it yourself.)
A non-specific EOI ought to be enough because we always EOI the
/current/ IRQ.
So, you're saying that your non-specific EOI occurs quickly enough or
early enough in the code that out-of-order EOIs won't be an issue. Ok.
Post by James Harris
Post by Rod Pemberton
Sorry, I made an assumption that you were splitting your routine
into two, but instead, it seems you've rolled it into one.
Mask the IRQ
Add it to the waiting set
iret
The other does
Mask the IRQ
Add it to the waiting set
Loop over all those waiting
iret
The only difference is the loop.
If a number of IRQs fire at the same time the first to be executed
will take the path which includes the loop. While the loop is
running, however, all further IRQs will take the shorter path which
does not include the loop.
Ok. So, you're unrolling the loop.
Post by James Harris
Post by Rod Pemberton
Apparently, at least, according to the x86 INT instruction
"Operation" description that I'm reading online, the IF flag is
*NOT* cleared for a normal software INT instruction when in PM.
I.e., PM software INT won't clear the IF flag, but a RM software
INT will clear the IF flag.
Not quite. Take a look at
https://css.csail.mit.edu/6.858/2012/readings/i386/INT.htm
If I read it right it says that in Protected Mode IF will be zeroed
if the interrupt vectors via an interrupt gate but that if the IDT
contains a trap gate instead then IF will not be zeroed.
In the version you posted a link to, the PM interrupt falls through to
a trap gate or an interrupt gate.
The version which I was reading, has an extra line in the
(* PE=1, DPL<CPL, software interrupt *)
It's a comment line. Unfortunately, I took the "software interrupt" as
a marker to indicate that the PM interrupt is called at that point.
...
(* Starts execution of new routine in Protected Mode *)
END;
http://qcd.phys.cmu.edu/QCDcluster/intel/vtune/reference/vc140.htm
Post by James Harris
Post by Rod Pemberton
If that's correct, as I haven't checked against the pdf manuals
..., that would imply that maybe the CLI/STI pair needs to be moved
to the start/end of the entire merged routine for PM code, as the
IF flag wouldn't be cleared automatically for PM software INTs,
unlike for RM INTs. Hardware IRQs will clear IF flag for any mode.
Also note that CLI only works in PM if the CPL and IOPL are set
properly.
I don't think I need them. AIUI the hardware begins the
interrupt-handling code with interrupts disabled.
I was attempting to determine all instances where IF can be cleared.
The complexity of the INT instruction flow doesn't make it clear. Even
after adjusting for some mistakes (below), it still appears to me that
IF isn't cleared for two INT instruction sections ... Maybe, I'll
attempt to reread it some other time.
The formatting of the page I was reading is messed up too. There is no
spacing between some sections. So, my list is incorrect. I don't
immediately see an IF clear in TASK-GATE or
INTERRUPT-FROM-VIRTUAL-8086-MODE, but they could be calling other
sections. There appears to be an IF clear in INTRA- and
INTER-PRIVILEGE-LEVEL-INTERRUPT. So, g) in my list is wrong. g) should
be for a task gate. Sigh, it's too hard to follow that code ... I.e.,
there may still be a possibility that IF isn't always clear, e.g., task
gate or v86. I'm probably still not reading that correctly.
Here is a copy of my pages (sorry that it's a picture; I lost my upload site):

[image not available]

wolfgang
James Harris
2021-04-20 10:52:36 UTC
Post by Rod Pemberton
On Sun, 18 Apr 2021 11:36:46 +0100
...
Post by Rod Pemberton
Post by James Harris
In the proposal EOI is issued at the beginning of the routine for
every IRQ which occurs. In fact, PICs which support automatic EOI
could issue the EOI even before we get control. An Intel 8259A
"In the AEOI mode the ISR bit is reset at the end of the third INTA pulse."
(I believe EOI just clears the relevant bit in the 8259's ISR.)
So, you're saying that IMR will block future hardware interrupts, so
that you can immediately clear EOI for the current hardware interrupt?
Ok.
Yes, the idea is that once an IRQ fires, its bit in the IMR would be set
(adding to those already set) to prevent further triggering of the same
IRQ.

In the proposal that bit would be cleared once the interrupt had been
handled but there is an alternative option to clear the bit once all
active IRQs have been handled. That would prevent one IRQ from starving
out the others. Or the mask bits of some IRQs could be cleared
immediately while others were cleared at the end. It would depend on
which devices were on which IRQs and how they ought to be treated. Using
the IMRs rather than the traditional EOI certainly gives a lot of options!

In theory, at the instant that the bit is set in the IMR the CPU's
interrupts could be enabled immediately (EOI having already been issued
at this point) though I do wonder if PICs could be slow to give effect
to the setting. Do some PIC chips respond immediately? I don't know.

The one chip I know which really can have a delay after being set is the
KBC but that's because internally it autonomously runs a small internal
program rather than using hardware to carry out what it's told to do.

...
Post by Rod Pemberton
Post by James Harris
Post by Rod Pemberton
Are your main interrupt handling routines re-entrant or being
masked off with CLI/STI?
The intention is to enter the individual interrupt handlers with
interrupts enabled. The handlers can disable interrupts over parts of
their code if they need to.
...
Safe?
I think so. Why would it not be?
Post by Rod Pemberton
Post by James Harris
Post by Rod Pemberton
Most OSes do a generic EOI clear, as EOIs are being handled
in-order, in-time. So, I strongly suspect that the *specific*
EOI likely must be cleared, since EOIs are now being handled
out-of-order (as I did for my OS). I.e., you could possibly clear
the wrong EOI with a generic EOI clear. (Yes, I'm not entirely
sure about that bit on EOIs ... Somewhat unsure actually. So,
don't quote me. Keep it in mind and verify it yourself.)
A non-specific EOI ought to be enough because we always EOI the
/current/ IRQ.
So, you're saying that your non-specific EOI occurs quickly enough or
early enough in the code that out-of-order EOIs won't be an issue. Ok.
The EOIs won't be out of order. Each would be for the current IRQ. For
example, with standard settings say IRQs 3, 4 and 5 fired at the same
time. The IRQ manager would:

1. Receive IRQ3, EOI it and mask it, then reenable interrupts
2. Receive IRQ4, EOI it and mask it, then reenable interrupts
3. Receive IRQ5, EOI it and mask it, then reenable interrupts

All of those would happen in order.
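
The first-level step described above can be modelled in a few lines of C. This is only a software model under stated assumptions: `struct irq_state` and `irq_arrive` are hypothetical names, the EOI is assumed to have been issued already (by AEOI or explicitly), and the waiting set is indexed by IRQ number for brevity. The value returned is what real code would write to the master PIC's data port (0x21).

```c
#include <assert.h>
#include <stdint.h>

/* Software model only: a shadow copy of the master PIC's IMR plus
   the set of IRQs which have fired but not yet been serviced. */
struct irq_state {
    uint8_t  imr_shadow;  /* bit n set = IRQ n is masked */
    uint16_t waiting;     /* bit n set = IRQ n awaits service */
};

/* Hypothetical first-level step for one IRQ on the master PIC (0..7).
   EOI is assumed to have been issued before this runs.
   Returns the new mask value that would be written to port 0x21. */
static uint8_t irq_arrive(struct irq_state *s, unsigned irq)
{
    assert(irq < 8);
    s->imr_shadow |= (uint8_t)(1u << irq);   /* mask this IRQ */
    s->waiting    |= (uint16_t)(1u << irq);  /* note that it needs service */
    return s->imr_shadow;
}
```

If IRQs 3, 4 and 5 arrive in turn, each call masks one more line and adds one more bit to the waiting set, so nothing is EOId out of order.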

...
Post by Rod Pemberton
Post by James Harris
Mask the IRQ
Add it to the waiting set
iret
The other does
Mask the IRQ
Add it to the waiting set
Loop over all those waiting
iret
The only difference is the loop.
If a number of IRQs fire at the same time the first to be executed
will take the path which includes the loop. While the loop is
running, however, all further IRQs will take the shorter path which
does not include the loop.
Ok. So, you're unrolling the loop.
No, the loop isn't unrolled. At least not in the compiler sense.

...
Post by Rod Pemberton
I was attempting to determine all instances where IF can be cleared.
The complexity of the INT instruction flow doesn't make it clear. Even
after adjusting for some mistakes (below), it still appears to me that
IF isn't cleared for two INT instruction sections ... Maybe, I'll
attempt to reread it some other time.
The formatting of the page I was reading is messed up too. There is no
spacing between some sections. So, my list is incorrect. I don't
immediately see an IF clear in TASK-GATE or
I'm not sure but wouldn't a task gate load an entire flags register from
the TSS of the new task?
Post by Rod Pemberton
INTERRUPT-FROM-VIRTUAL-8086-MODE, but they could be calling other
In

https://css.csail.mit.edu/6.858/2012/readings/i386/INT.htm

under INTERRUPT-FROM-V86-MODE: it says

IF service through Interrupt Gate THEN IF = 0;
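
As a rough aid to the question of when IF ends up clear, here is a minimal C sketch. It assumes the usual IA-32 gate type encodings (0x5 task, 0x6/0x7 16-bit interrupt/trap, 0xE/0xF 32-bit interrupt/trap) and models only the IF bit; `if_after_gate` is a hypothetical helper, not something from the manuals.

```c
#include <assert.h>
#include <stdbool.h>

/* Gate type field values as used in IA-32 IDT descriptors. */
enum gate_type {
    GATE_TASK   = 0x5,
    GATE_INT16  = 0x6,
    GATE_TRAP16 = 0x7,
    GATE_INT32  = 0xE,
    GATE_TRAP32 = 0xF
};

/* Hypothetical helper: state of IF as seen by the handler.
   if_before  = IF in the interrupted code
   if_new_tss = IF bit in the EFLAGS image of the target TSS
                (only relevant for a task gate) */
static bool if_after_gate(enum gate_type type, bool if_before, bool if_new_tss)
{
    switch (type) {
    case GATE_INT16:
    case GATE_INT32:
        return false;        /* interrupt gates clear IF */
    case GATE_TRAP16:
    case GATE_TRAP32:
        return if_before;    /* trap gates leave IF alone */
    case GATE_TASK:
        return if_new_tss;   /* task switch loads EFLAGS from the new TSS */
    }
    assert(!"unknown gate type");
    return if_before;
}
```

So IF is guaranteed clear on entry only for interrupt gates; for a task gate it depends on what the new task's TSS holds.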
--
James Harris
James Harris
2021-04-18 10:45:42 UTC
Permalink
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
...
Post by Rod Pemberton
Post by James Harris
If nesting level is zero
Increment nesting level
Push more registers, as required
Loop while there are any interrupts waiting
Pick the IRQ /we/ want to treat as of highest priority
Delete it from waiting list
Enable interrupts
Handle the IRQ
Disable interrupts
Unmask the IRQ
Endloop
Pop the registers pushed earlier in this fragment
Decrement nesting level
Endif
Is this and the earlier portion intended to be re-entrant or should the
entire thing be disabled entirely from interrupts? I.e., it seems you
pushed CLI/STI down to just prior to calling the IRQ routine.
I am not completely sure what you mean there. It may make the proposal
clearer if I write it as real assembly code. Bear in mind that this is
being made up on the fly. It may/will contain bugs but should make the
intention much clearer.


Let's say that IRQs 0-15 have been mapped to trigger interrupts 32-47
(0x20-0x2F). Here's the initial code for IRQ0.

interrupt_0x20: ;PIT, IRQ0
push eax
mov eax, 0x20
jmp pic0_manager


The other IRQs would be similar. For example, the initial code for the
mouse (IRQ12) would be

interrupt_0x2C: ;Mouse, IRQ12
push eax
mov eax, 0x2C
jmp pic1_manager


Basically the code fragments would save the interrupt number in EAX and
jump to suitable code to deal with the PIC which woke us up. Here is
that code for the master PIC.


;************************************************************
;
; pic0_manager
;
;************************************************************

pic0_manager:
push ebx
push ecx
push edx

;Zero-base the IRQ number by subtracting that of IRQ0
lea ecx, [eax - 0x20] ;Set ECX to IRQ id (range 0 to 7)
cmp ecx, 7 ;Is it really in range 0 to 7?
ja panic ;Abort if error in kernel code

;Make the IRQ number in ECX into a mask bit
mov ebx, 1
shl ebx, cl ;Shift count has to be in CL

;Inhibit this interrupt
mov al, [pic0_mask] ;Fetch current mask
or al, bl ;Or in this bit
mov [pic0_mask], al ;Save new mask
mov dx, 0x21 ;Master PIC's data port
call out_dx_al ;Update the PIC's mask


That should mask off the particular IRQ which led to the code being run.
The next step is to add the IRQ to those waiting for service but it
would make it easier later to determine which IRQ had which priority if
we map the IRQ to a priority now and store the priority rather than the
IRQ number. That can be done by looking up the priority we want to
assign to this IRQ in a table in which each IRQ has a different priority.


mov cx, [priorities + ecx*2] ;Entries are words so scale the index
cmp cx, 15
ja panic ;Internal error

;Add this interrupt to the waiting array (indexed by priority)
mov ebx, 1
shl ebx, cl
or [waiting], bx

cmp byte [nesting_level], 0 ;Already nested?
ja cleanup ;Yes, skip to cleanup

call pic_loop ;Loop until no IRQ needs service

cleanup:
pop edx
pop ecx
pop ebx
pop eax
iret


That's supposed to show pretty much the complete processing for an IRQ
which fires while another is already being processed.
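
The pic_loop routine called above is not shown in assembly, so here is a sketch of it in C, following the earlier pseudocode. Everything in it is an assumption for illustration: the `struct irq_manager` layout, the `irq_of_priority` inverse table, and the `log_handler` example handler are hypothetical, and the points where real code would execute STI and CLI are only marked by comments.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQS 16

typedef void (*irq_handler_fn)(unsigned irq);

struct irq_manager {
    uint16_t waiting;        /* bit p set = priority p awaits service */
    uint16_t masked;         /* bit n set = IRQ n masked (models both IMRs) */
    unsigned nesting_level;
    uint8_t  irq_of_priority[NUM_IRQS];  /* inverse of the priority table */
    irq_handler_fn handler[NUM_IRQS];
};

/* Example handler for illustration: records the order of service. */
static unsigned service_log[NUM_IRQS];
static unsigned service_count;

static void log_handler(unsigned irq)
{
    if (service_count < NUM_IRQS)
        service_log[service_count++] = irq;
}

/* Service waiting IRQs, lowest priority number first, until none remain.
   Does nothing if a loop is already running further down the stack. */
static void pic_loop(struct irq_manager *m)
{
    if (m->nesting_level != 0)
        return;
    m->nesting_level++;
    while (m->waiting != 0) {
        unsigned prio = 0;
        while (((m->waiting >> prio) & 1u) == 0)
            prio++;                              /* find lowest set bit */
        m->waiting &= (uint16_t)~(1u << prio);   /* delete from waiting set */
        unsigned irq = m->irq_of_priority[prio];
        assert(irq < NUM_IRQS);
        /* real code: STI here */
        if (m->handler[irq])
            m->handler[irq](irq);
        /* real code: CLI here */
        m->masked &= (uint16_t)~(1u << irq);     /* unmask the IRQ */
    }
    m->nesting_level--;
}
```

Because the waiting set is indexed by priority rather than by IRQ number, picking the next IRQ is just a search for the lowest set bit.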



Priority Table
==============

The priority table could be whatever we wanted. The only requirement is
that each interrupt has its own priority so we can still distinguish
them. In other words, don't map two IRQs to the same priority.


For example, here's a priority table intended to handle the IRQs on the
master PIC before those on the slave.

dw 0, 1, 2, 3, 4, 5, 6, 7
dw 8, 9, 10, 11, 12, 13, 14, 15


Here's one intended to prioritise IRQs 14 and 15 (by assigning them
priorities 0 and 1) then the master PIC, then the rest of the slave.

dw 3, 4, 2, 5, 6, 7, 8, 9
dw 10, 11, 12, 13, 14, 15, 0, 1 ;<-- IRQs 14 & 15 get prio 0 & 1


Here's a table to implement the standard PC/AT priorities, i.e. PIT then
keyboard then RTC etc.

dw 0, 1, 2, 11, 12, 13, 14, 15
dw 3, 4, 5, 6, 7, 8, 9, 10


IRQ2 should never fire so the priority assigned to it is meaningless. In
the example tables I've kept its priority as 2 in each case for no
reason other than it looks better than -1. ;-)
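
The "each IRQ has its own priority" requirement is easy to check mechanically. The following C sketch, with the hypothetical helper name `invert_priorities`, verifies that a table is a permutation of 0 to 15 and at the same time builds the inverse mapping (priority to IRQ) that a service loop would need. The table used is the standard PC/AT one from above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_IRQS 16

/* The standard PC/AT table from the post, indexed by IRQ number. */
static const uint16_t pcat_priorities[NUM_IRQS] = {
    0, 1,  2, 11, 12, 13, 14, 15,
    3, 4,  5,  6,  7,  8,  9, 10
};

/* Hypothetical helper: returns true and fills irq_of_priority[] if every
   IRQ has a distinct priority in the range 0..15; returns false otherwise. */
static bool invert_priorities(const uint16_t prio[NUM_IRQS],
                              uint8_t irq_of_priority[NUM_IRQS])
{
    uint16_t seen = 0;
    assert(prio && irq_of_priority);
    for (unsigned irq = 0; irq < NUM_IRQS; irq++) {
        uint16_t p = prio[irq];
        if (p >= NUM_IRQS || (seen & (1u << p)))
            return false;               /* out of range or duplicate */
        seen |= (uint16_t)(1u << p);
        irq_of_priority[p] = (uint8_t)irq;
    }
    return true;
}
```

Running such a check once at start-up would catch a mistyped table before it could cause two IRQs to share a bit in the waiting set.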
--
James Harris
Rod Pemberton
2021-04-18 17:06:42 UTC
Permalink
On Sun, 18 Apr 2021 11:45:42 +0100
Post by James Harris
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
Post by James Harris
If nesting level is zero
Increment nesting level
Push more registers, as required
Loop while there are any interrupts waiting
Pick the IRQ /we/ want to treat as of highest priority
Delete it from waiting list
Enable interrupts
Handle the IRQ
Disable interrupts
Unmask the IRQ
Endloop
Pop the registers pushed earlier in this fragment
Decrement nesting level
Endif
Is this and the earlier portion intended to be re-entrant or should
the entire thing be disabled entirely from interrupts? I.e., it
seems you pushed CLI/STI down to just prior to calling the IRQ
routine.
I am not completely sure what you mean there. It may make the
proposal clearer if I write it as real assembly code. Bear in mind
that this is being made up on the fly. It may/will contain bugs but
should make the intention much clearer.
Actually, I didn't read that clearly enough. Sorry.

I assumed "Disable interrupts" was prior to "Handle the IRQ" and "Enable
interrupts" was after. So, I was asking why "Disable interrupts" wasn't
immediately after, e.g., "interrupt_0x20", and why "Enable interrupts"
wasn't just prior to IRET. I.e., it appeared to me that you were making
the outer portion of the routine interrupt-able or perhaps re-entrant.

Apparently, it's the inner portion where you're doing that. So, I'll
ask the reverse. Why are you letting the IRQ handler routine be
interrupted?
Post by James Harris
Let's say that IRQs 0-15 have been mapped to trigger interrupts 32-47
(0x20-0x2F). Here's the initial code for IRQ0.
interrupt_0x20: ;PIT, IRQ0
push eax
mov eax, 0x20
jmp pic0_manager
The other IRQs would be similar. For example, the initial code for
the mouse (IRQ12) would be
interrupt_0x2C: ;Mouse, IRQ12
push eax
mov eax, 0x2C
jmp pic1_manager
interrupt_0x20:
mov [int_no], 0x20
mov [mask_val], 0x...
jmp pic0_manager

Why not use a variable(s) in memory?
I.e., no need to corrupt eax and restore it later.

The mask value should be computable and you should be able to hardcode
it here as well.
Post by James Harris
Basically the code fragments would save the interrupt number in EAX
and jump to suitable code to deal with the PIC which woke us up. Here
is that code for the master PIC.
...
Post by James Harris
;************************************************************
;
; pic0_manager
;
;************************************************************
push ebx
push ecx
push edx
At some point, this will likely become a PUSHA ... ;)
If for no other reason, than proactive safety.
Post by James Harris
;Zero-base the IRQ number by subtracting that of IRQ0
lea ecx, [eax - 0x20] ;Set ECX to IRQ id (range 0 to 7)
cmp ecx, 7 ;Is really in range 0 to 7?
ja panic ;Abort if error in kernel code
Why wouldn't it be in the range 0 to 7? Since you're passing in the
correct value for the interrupt in eax, how could it not be correct?
Bad coding? ...
Post by James Harris
;Make the IRQ number in ECX into a mask bit
mov ebx, 1
shl ebx, cl
If you precomputed the mask earlier, and hard coded it within each
interrupt_0x.., then ...
Post by James Harris
;Inhibit this interrupt
mov al, [pic0_mask] ;Fetch current mask
or al, bl ;Or in this bit
mov bl, [mask]

you could simply load the in-memory mask value here. Yes?
Post by James Harris
mov [pic0_mask], al ;Save new mask
mov dx, 0x21 ;Master PIC's data port
call out_dx_al ;Update the PIC's mask
<snip>
...

--
James Harris
2021-04-20 11:14:44 UTC
Permalink
Post by Rod Pemberton
On Sun, 18 Apr 2021 11:45:42 +0100
Post by James Harris
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
...
Post by Rod Pemberton
Why are you letting the IRQ handler routine be
interrupted?
Interrupts need to be enabled so that the routine gets notified of (i.e.
interrupted by) other IRQs which we want to treat as of higher priority.
And in terms of interrupt response it's best to have IRQs disabled for
as short a time as possible.
Post by Rod Pemberton
Post by James Harris
Let's say that IRQs 0-15 have been mapped to trigger interrupts 32-47
(0x20-0x2F). Here's the initial code for IRQ0.
interrupt_0x20: ;PIT, IRQ0
push eax
mov eax, 0x20
jmp pic0_manager
The other IRQs would be similar. For example, the initial code for
the mouse (IRQ12) would be
interrupt_0x2C: ;Mouse, IRQ12
push eax
mov eax, 0x2C
jmp pic1_manager
mov [int_no], 0x20
mov [mask_val], 0x...
jmp pic0_manager
Why not use a variable(s) in memory?
DS would need to be set correctly for those instructions to refer to
kernel memory. I think the only segment registers the CPU will have set
will be CS and SS.
Post by Rod Pemberton
I.e., no need to corrupt eax and restore it later.
The mask value should be computable and you should be able to hardcode
it here as well.
That's not a bad idea. I may do something like that.
Post by Rod Pemberton
Post by James Harris
Basically the code fragments would save the interrupt number in EAX
and jump to suitable code to deal with the PIC which woke us up. Here
is that code for the master PIC.
...
Post by James Harris
;************************************************************
;
; pic0_manager
;
;************************************************************
push ebx
push ecx
push edx
At some point, this will likely become a PUSHA ... ;)
If for no other reason, than proactive safety.
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
Post by Rod Pemberton
Post by James Harris
;Zero-base the IRQ number by subtracting that of IRQ0
lea ecx, [eax - 0x20] ;Set ECX to IRQ id (range 0 to 7)
cmp ecx, 7 ;Is really in range 0 to 7?
ja panic ;Abort if error in kernel code
Why wouldn't it be in the range 0 to 7? Since you're passing in the
correct value for the interrupt in eax, how could it not be correct?
Bad coding? ...
/Defensive/ coding. Especially when the routines are under development.
But you are right that such checks should not be necessary.
Post by Rod Pemberton
Post by James Harris
;Make the IRQ number in ECX into a mask bit
mov ebx, 1
shl ebx, cl
If you precomputed the mask earlier, and hard coded it within each
interrupt_0x.., then ...
Post by James Harris
;Inhibit this interrupt
mov al, [pic0_mask] ;Fetch current mask
or al, bl ;Or in this bit
mov bl, [mask]
you could simply load the in-memory mask value here. Yes?
Yes. I may do that.
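
To show what "precomputed" could mean here, this C sketch derives the PIC data port and mask bit from a vector number, assuming IRQ0 is mapped to vector 0x20 as earlier in the thread. `struct irq_route` and `route_for_vector` are hypothetical; in practice the two values would be baked into each stub as constants rather than computed at run time.

```c
#include <assert.h>
#include <stdint.h>

#define VECTOR_BASE 0x20u   /* IRQ0 mapped to vector 0x20, as in the thread */

struct irq_route {
    uint16_t data_port;     /* 0x21 = master PIC, 0xA1 = slave PIC */
    uint8_t  mask_bit;      /* the single IMR bit for this IRQ */
};

/* Hypothetical helper: the constants a stub could hard-code. */
static struct irq_route route_for_vector(unsigned vector)
{
    struct irq_route r;
    unsigned irq;

    assert(vector >= VECTOR_BASE && vector < VECTOR_BASE + 16);
    irq = vector - VECTOR_BASE;
    r.data_port = (irq < 8) ? 0x21 : 0xA1;
    r.mask_bit  = (uint8_t)(1u << (irq & 7));
    return r;
}
```

For the mouse on vector 0x2C (IRQ12) this gives port 0xA1 and mask bit 0x10, which is what the stub would load instead of shifting at run time.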
--
James Harris
wolfgang kern
2021-04-20 12:29:06 UTC
Permalink
On 20.04.2021 13:14, James Harris wrote:
...
Post by James Harris
Post by James Harris
Let's say that IRQs 0-15 have been mapped to trigger interrupts 32-47
(0x20-0x2F). Here's the initial code for IRQ0.
interrupt_0x20: ;PIT, IRQ0
    push eax
    mov eax, 0x20
    jmp pic0_manager
The other IRQs would be similar. For example, the initial code for
the mouse (IRQ12) would be
interrupt_0x2C: ;Mouse, IRQ12
    push eax
    mov eax, 0x2C
    jmp pic1_manager
what I had for the common stuff in PM (only four bytes each):
IRQ_0:
push 00 ;byte (sign-extended)
jmp short common
IRQ_1:
push 01
jmp short common
...
IRQ_c:
push 4c ;JFI b6 tells it's on the second PIC
jmp short common

so the table is also easy with aligned four bytes
and common starts with:
XCHG ecx,[esp] ;aka push ecx and get IRQ-# into ecx
... ;more common stuff like save regs here
and it creates the mask for event flags setting and use a
hardcoded branch LUT for the hardware responder which all
end with a jump to common EOI (0..7/8..F)

and I used the same for true-RM.
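
A small C sketch of how such an ID byte could be decoded, assuming (from the IRQ_c example) that the low four bits hold the IRQ number and bit 6 flags the second PIC. The helper names are hypothetical, and the IDs for IRQs other than 0, 1 and 12 are inferred rather than taken from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STUB_SIZE     4u     /* push imm8 (2 bytes) + jmp short (2 bytes) */
#define ID_SLAVE_FLAG 0x40u  /* bit 6: IRQ is on the second PIC */

/* ID byte a stub would push for a given IRQ (assumed scheme). */
static uint8_t stub_id(unsigned irq)
{
    assert(irq < 16);
    return (uint8_t)(irq | (irq >= 8 ? ID_SLAVE_FLAG : 0u));
}

/* Recover the IRQ number from the pushed ID. */
static unsigned id_to_irq(uint8_t id)
{
    return id & 0x0Fu;
}

/* True if the ID says the IRQ came via the second PIC. */
static bool id_on_slave(uint8_t id)
{
    return (id & ID_SLAVE_FLAG) != 0;
}

/* With aligned four-byte stubs the entry address is a simple multiple. */
static uint32_t stub_address(uint32_t first_stub, unsigned irq)
{
    assert(irq < 16);
    return first_stub + irq * STUB_SIZE;
}
```

Sixteen four-byte stubs occupy only 64 bytes, so a short jump from any of them can still reach a common routine placed directly after the last stub.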
Post by James Harris
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
on AMD: PUSHA takes the same time as three individual pushes,
but POPA may take a bit longer.
__
wolfgang
Rod Pemberton
2021-04-21 01:47:08 UTC
Permalink
On Tue, 20 Apr 2021 14:29:06 +0200
Post by wolfgang kern
push 00 ;byte (sign-extended)
jmp short common
push 01
jmp short common
...
push 4c ;JFI b6 tells it's on the second PIC
jmp short common
I use a MOV and JMP.

For DJGPP C code (i.e., GCC), I use a LEAVE instruction first, followed
by the MOV and JMP. DJGPP inserts some extra code for C's prolog/epilog
for which there is no C directive to eliminate.

OpenWatcom C code doesn't need the LEAVE instruction. It's clean,
just a MOV and JMP. The OpenWatcom code is for WASM which matches MASM
assembly syntax.

An early, 2006 version of my code, in inlined assembly (GNU GAS for
GCC) is posted at the link below. The central routine it called is
posted too. (a.o.d. Jun 24, 2006)

https://groups.google.com/g/alt.os.development/c/KUHv9fY7u_c/m/OP0KGvmIg98J
Post by wolfgang kern
so the table is also easy with aligned four bytes
XCHG ecx,[esp] ;aka push ecx and get IRQ-# into ecx
... ;more common stuff like save regs here
and it creates the mask for event flags setting and use a
hardcoded branch LUT for the hardware responder which all
end with a jump to common EOI (0..7/8..F)
and I used the same for true-RM.
That's a nice use for XCHG.

--
wolfgang kern
2021-04-22 20:05:55 UTC
Permalink
Post by Rod Pemberton
On Tue, 20 Apr 2021 14:29:06 +0200
Post by wolfgang kern
push 00 ;byte (sign-extended)
jmp short common
push 01
jmp short common
...
push 4c ;JFI b6 tells it's on the second PIC
jmp short common
I use a MOV and JMP.
?? MOV at the very start of an IRQ-routine ??
Post by Rod Pemberton
For DJGPP C code (i.e., GCC), I use a LEAVE instruction first, followed
by the MOV and JMP. DJGPP inserts some extra code for C's prolog/epilog
for which there is no C directive to eliminate.
OpenWatcom C code doesn't need the LEAVE instruction. It's clean,
just a MOV and JMP. The OpenWatcom code is for WASM which matches MASM
assembly syntax.
LEAVE at start? would this stack destroyer do any good ? :)
Post by Rod Pemberton
An early, 2006 version of my code, in inlined assembly (GNU GAS for
GCC) is posted at the link below. The central routine it called is
posted too. (a.o.d. Jun 24, 2006)
https://groups.google.com/g/alt.os.development/c/KUHv9fY7u_c/m/OP0KGvmIg98J
I see:
void isr00_base(void)
{
__asm__ (
"leave\n"
"movb $0x00,__ISR_no\n"
"jmp _ISR_IRQ_core\n"
);
}

and exactly such required detours made me hate C !

and my IDT contains INT-gates for HW-IRQs and Exceptions,
and not a single CALL-Gate.
Post by Rod Pemberton
Post by wolfgang kern
so the table is also easy with aligned four bytes
XCHG ecx,[esp] ;aka push ecx and get IRQ-# into ecx
... ;more common stuff like save regs here
and it creates the mask for event flags setting and use a
hardcoded branch LUT for the hardware responder which all
end with a jump to common EOI (0..7/8..F)
and I used the same for true-RM.
That's a nice use for XCHG.
yeah, even though it's not a fast instruction it saves a few lines of code.
__
wolfgang
Rod Pemberton
2021-04-23 20:28:57 UTC
Permalink
On Thu, 22 Apr 2021 22:05:55 +0200
Post by wolfgang kern
Post by Rod Pemberton
On Tue, 20 Apr 2021 14:29:06 +0200
Post by wolfgang kern
push 00 ;byte (sign-extended)
jmp short common
push 01
jmp short common
...
push 4c ;JFI b6 tells it's on the second PIC
jmp short common
I use a MOV and JMP.
?? MOV at the very start of an IRQ-routine ??
You PUSH the interrupt number and XCHG it into ECX.

I save the interrupt number to an address in memory via MOV to be
accessed by C code as a named variable. The C compiler could generate
a MOV back into ECX for that, or some other register.
Post by wolfgang kern
Post by Rod Pemberton
For DJGPP C code (i.e., GCC), I use a LEAVE instruction first,
followed by the MOV and JMP. DJGPP inserts some extra code for C's
prolog/epilog for which there is no C directive to eliminate.
OpenWatcom C code doesn't need the LEAVE instruction. It's clean,
just a MOV and JMP. The OpenWatcom code is for WASM which matches
MASM assembly syntax.
LEAVE at start? would this stack destroyer do any good ? :)
I would love to get rid of the LEAVE instruction.

As stated, it's needed to work with DJGPP C code for procedures.

It's possible that I overlooked some compiler option to eliminate the
necessity for LEAVE here. Other C compilers have options which
eliminate the need for it.
Post by wolfgang kern
Post by Rod Pemberton
An early, 2006 version of my code, in inlined assembly (GNU GAS for
GCC) is posted at the link below. The central routine it called is
posted too. (a.o.d. Jun 24, 2006)
https://groups.google.com/g/alt.os.development/c/KUHv9fY7u_c/m/OP0KGvmIg98J
void isr00_base(void)
{
__asm__ (
"leave\n"
"movb $0x00,__ISR_no\n"
"jmp _ISR_IRQ_core\n"
);
}
and exactly such required detours made me hate C !
and my IDT contains INT-gates for HW-IRQs and Exceptions,
and not a single CALL-Gate.
Well, I may (re)consider reprogramming the PICs going forward.

If I reprogram the PICs as everyone else does for their OSes, then I
could use both CALL-gates and INT-gates in the IDT. Use of the
INT-gate would eliminate a rather large block of C code too, used by my
central interrupt routine. Instead of one long central routine, the
code would be separated into two shorter central routines.

Since, I didn't reprogram the PICs, my IDT is setup for CALL-gates
only, currently (IIRC). I know there was a reason for not
reprogramming the PICs originally, but I don't recall why anymore. I
may have a note in my code somewhere. However, today, without
remembering my original design decisions (circa 2006), it seems like a
bit of a waste/unnecessary and/or overkill on my part to do what I did
... I clearly failed to take the path of least resistance.

I haven't worked on my OS in a few years now. It took forever and will
take forever to be anything other than a rudimentary OS demonstrator,
which is obsoleted by the time I'm done.

Another C program I wrote would actually work better as a real world OS
as it's loaded on top of 16-bit DOS and 32-bit DPMI, much like Windows
98/SE/ME. Of course, it would be limited to 32-bits, but that would be
a choice accepted by me or anyone who would use it.

Even if I do resume development at some point, the code is for 32-bit
x86. I'd likely need to rework some x86 code for 64-bit, which might be
easy to do on 64-bit Linux, but would be a problem for my development
environment which is 16-bit DOS + 32-bit DPMI. I.e., I may need to
migrate my OS development to Linux.

I have no idea what type of problems UEFI may cause, or what
issues changes to PC hardware may cause.

At this point in time, I'd rather let the code bit-rot and work on
other projects or ideas. After some recent problems recompiling code
for Linux, Linux which was working well, is now working beautifully.
So, I'll probably, at some point in the future, migrate my C code to
Linux and proceed from there. Most of my utilities and projects should
work just fine, except for my OS which accesses hardware directly.

--
wolfgang kern
2021-04-24 01:33:32 UTC
Permalink
Post by Rod Pemberton
Post by wolfgang kern
Post by Rod Pemberton
Post by wolfgang kern
push 00 ;byte (sign-extended)
...
Post by Rod Pemberton
Post by wolfgang kern
Post by Rod Pemberton
I use a MOV and JMP.
?? MOV at the very start of an IRQ-routine ??
You PUSH the interrupt number and XCHG it into ECX.
I save the interrupt number to an address in memory via MOV to be
accessed by C code as a named variable. The C compiler could generate
a MOV back into ECX for that, or some other register.
I don't get that. An IRQ works as follows:
PUSH flags
PUSH CS
PUSH IP

so the IRQ-routine starts with these three things on the stack.
if you do a MOV reg you destroy an unsaved register so you can only do a
MOV mem to [ESP+...] because only SS is known here.

So which do you think is shorter and faster:
6A 4C PUSH 4c or
C6 04 24 4c MOV byte [esp],4c or, even worse in your case,
C7 04 24 4C 00 00 00 MOV dword [esp],0000004c

oh I forgot that C abuses/reserves EBP along all lines ...
but this doesn't make it any better.

...
Post by Rod Pemberton
Well, I may (re)consider reprogramming the PICs going forward.
If I reprogram the PICs as everyone else does for their OSes, then I
could use both CALL-gates and INT-gates in the IDT. Use of the
INT-gate would eliminate a rather large block of C code too, used by my
central interrupt routine. Instead of one long central routine, the
code would be separated into two shorter central routines.
Since, I didn't reprogram the PICs, my IDT is setup for CALL-gates
only, currently (IIRC). I know there was a reason for not
reprogramming the PICs originally, but I don't recall why anymore. I
may have a note in my code somewhere. However, today, without
remembering my original design decisions (circa 2006), it seems like a
bit of a waste/unnecessary and/or overkill on my part to do what I did
... I clearly failed to take the path of least resistance.
my main reason for keeping original PIC and IVT were RM BIOS-functions.
but PM asks for relocating IRQs to avoid conflicts with exceptions.
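
For reference, relocating the IRQs means re-running the 8259A initialisation sequence with new vector bases. The C sketch below only builds the list of port writes so that it can be inspected; `struct port_write` and `pic_remap_sequence` are hypothetical names. Real code would perform each write with OUT, might need a short delay between writes on old hardware, and would restore the IMRs afterwards. AEOI is not enabled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct port_write {
    uint16_t port;
    uint8_t  value;
};

/* Fill out[] with the ICW1..ICW4 writes for both PICs; returns the count.
   The vector bases must be multiples of 8. */
static size_t pic_remap_sequence(uint8_t master_base, uint8_t slave_base,
                                 struct port_write out[8])
{
    size_t n = 0;

    assert((master_base & 7) == 0 && (slave_base & 7) == 0);
    out[n++] = (struct port_write){0x20, 0x11};        /* ICW1: init, cascade, ICW4 follows */
    out[n++] = (struct port_write){0xA0, 0x11};
    out[n++] = (struct port_write){0x21, master_base}; /* ICW2: master vector base */
    out[n++] = (struct port_write){0xA1, slave_base};  /* ICW2: slave vector base */
    out[n++] = (struct port_write){0x21, 0x04};        /* ICW3: slave attached to IRQ2 */
    out[n++] = (struct port_write){0xA1, 0x02};        /* ICW3: slave cascade identity 2 */
    out[n++] = (struct port_write){0x21, 0x01};        /* ICW4: 8086 mode, normal EOI */
    out[n++] = (struct port_write){0xA1, 0x01};
    return n;
}
```

Calling it with bases 0x20 and 0x28 gives the mapping used earlier in the thread, with IRQs 0-15 on vectors 0x20-0x2F and clear of the CPU exceptions.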
Post by Rod Pemberton
I haven't worked on my OS in a few years now. It took forever and will
take forever to be anything other than a rudimentary OS demonstrator,
which is obsoleted by the time I'm done.
Another C program I wrote would actually work better as a real world OS
as it's loaded on top of 16-bit DOS and 32-bit DPMI, much like Windows
98/SE/ME. Of course, it would be limited to 32-bits, but that would be
a choice accepted by me or anyone who would use it.
all such problems disappeared after I wrote my own "Hi-mem".
I made it partly DOS-compatible but added more features because there
were so many unused bytes left in the reserved 4KB block. Still in use.
Post by Rod Pemberton
Even if I do resume development at some point, the code is for 32-bit
x86. I'd likely need to rework some x86 code for 64-bit, which might be
easy to do on 64-bit Linux, but would be a problem my development
environment which is 16-bit DOS + 32-bit DPMI. I.e., I may need to
migrate my OS development to Linux.
I don't like Loonix, too many cooks ...
Post by Rod Pemberton
I have no idea what type of problems UEFI may cause, or what
issues changes to PC hardware may cause.
At this point in time, I'd rather let the code bit-rot and work on
other projects or ideas. After some recent problems recompiling code
for Linux, Linux which was working well, is now working beautifully.
So, I'll probably, at some point in the future, migrate my C code to
Linux and proceed from there. Most of my utilities and projects should
work just fine, except for my OS which accesses hardware directly.
everything* can be done with Java and C
*) except an OS !!!
__
wolfgang
Rod Pemberton
2021-04-24 06:12:55 UTC
Permalink
On Sat, 24 Apr 2021 03:33:32 +0200
Post by wolfgang kern
Post by Rod Pemberton
Post by wolfgang kern
Post by Rod Pemberton
Post by wolfgang kern
push 00 ;byte (sign-extended)
I use a MOV and JMP.
?? MOV at the very start of an IRQ-routine ??
You PUSH the interrupt number and XCHG it into ECX.
I save the interrupt number to an address in memory via MOV to be
accessed by C code as a named variable. The C compiler could
generate a MOV back into ECX for that, or some other register.
PUSH flags
PUSH CS
PUSH IP
so the IRQ-routine starts with these three things on the stack.
if you do a MOV reg you destroy an unsaved register so you can only
do a MOV mem to [ESP+...] because only SS is known here.
There is no "MOV reg" here. This is a "MOV m8, imm8" e.g. C6 /0. The
registers aren't accessed.


Hopefully, this will explain it better. I've included WASM inlined
assembly too, which is like MASM. Maybe, it's an easier read as
the x86 instruction arguments aren't reversed.

"_ISR_no" without quotes is a named variable declared as a "char" in C
which corresponds to an imm8 immediate for x86 assembly, i.e., it's an
unsigned 8-bit byte:

volatile unsigned char _ISR_no=0;

volatile <- explained further below ...
unsigned <- means the same thing as in x86 assembly
char <- a byte of 8-bits, because x86 imm8 is 8-bits
_ISR_no <- name or label used to access variable in C
= 0 <- clear the value to zero via assignment =

Below, in the inlined assembly syntax for C, _ISR_no needs an extra
underscore __ISR_no to be referenced by assembly code. They're the same
memory address, same variable.

I posted interrupt 70h so you can see where the interrupt value of 70h
is located in the code.

Also, the OpenWatcom section is in inlined WASM which should look just
like MASM, hopefully.

Note that for the AT&T syntax (GAS) in the DJGPP section, the x86
instruction argument order is /reversed/ from the Intel syntax (WASM).

void __declspec(naked) isr70_base(void)
{
#ifdef __DJGPP__
__asm__ (
"leave\n"
"movb $0x70,__ISR_no\n"
"jmp _ISR_IRQ_core\n"
);
#endif
#ifdef __WATCOMC__
_asm {
.386
mov byte ptr _ISR_no,070h
jmp ISR_IRQ_core
}
#endif
}

So, here, via the "mov byte ptr _ISR_no, 070h" I move the interrupt
number 070h, which is an imm8 immediate/constant, into memory at
__ISR_no. This MOV should be compiled as a C6 MOV of an imm8 value of
70h, into an m8 memory address for a byte/imm8, using the memory address
for __ISR_no. The C6 MOV can also move an imm8 into r8 register, but
that's not what was requested to be compiled here.

In C, _ISR_no is accessed like any other C variable. E.g.,
if(_ISR_no==0x01) {...};

The "volatile" on the C declaration for _ISR_no tells the C compiler to
always load the value from memory, as the load might otherwise be eliminated
via optimization. This can occur because the value for _ISR_no is being
modified outside the scope of the C language, i.e., within the inlined
assembly.
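
A tiny self-contained C illustration of that point, with stand-in names: `isr_no` plays the role of _ISR_no and `fake_stub` stands in for the assembly stub, which the compiler cannot see into. It is a sketch of the idea, not the code from the OS.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for _ISR_no: written behind the compiler's back, so every
   read must really go to memory. */
static volatile unsigned char isr_no = 0;

/* Stand-in for the assembly stub that stores the interrupt number. */
static void fake_stub(unsigned char n)
{
    isr_no = n;
}

/* C-side test of which interrupt was recorded last. Because isr_no is
   volatile the compiler may not reuse a value it loaded earlier. */
static bool last_interrupt_was(unsigned char n)
{
    return isr_no == n;
}
```

Without volatile, a compiler that sees no C-level store to the variable would be entitled to treat it as still holding its initial value.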

The LEAVE is needed to eliminate DJGPP epilog/prolog code for
C procedures. The __declspec(naked) does the same for OpenWatcom
compiler. Since __declspec(naked) doesn't work for DJGPP, it is
eliminated by a C pre-processor #define for DJGPP.
Post by wolfgang kern
Post by Rod Pemberton
The C compiler could
generate a MOV back into ECX for that, or some other register.
I don't get that.
When _ISR_no is accessed in C, the C compiler can "MOV r8, m8" into
an r8 register the m8 value at _ISR_no. The compiler can select almost
any 8-bit register of its choice, e.g., CL (or perhaps "MOVZX
r16/32, m8" into ECX since it's unsigned). Since the C compiler keeps
track of which registers are in-use by the C compiler and which
registers can or can't be destroyed/preserved/clobbered, the C compiler
will produce appropriate wrapper code to save or restore registers as
is needed, e.g. PUSH/POP ECX around CL code. Most C compilers clear
the full register before an 8-bit load or they use MOVZX/MOVSX. So,
the MOV would effectively be to ECX even though the value is 8-bit.

--
wolfgang kern
2021-04-24 09:04:17 UTC
Permalink
On 24.04.2021 08:12, Rod Pemberton wrote:
...
Post by Rod Pemberton
There is no "MOV reg" here. This is a "MOV m8, imm8" e.g. C6 /0. The
registers aren't accessed.
you can only use a variable which is on stack because DS is unknown yet.
Post by Rod Pemberton
Hopefully, this will explain it better. I've included WASM inlined
assembly too, which is like MASM. Maybe, it's an easier read as
the x86 instruction arguments aren't reversed.
not really better to understand for me :)

Let me repeat:
...
so you can only do a MOV mem to [ESP+...] because only SS is known here.
...
So which do you think is shorter and faster:
6A 4C PUSH 4c or
C6 44 24 xx 4c MOV byte [esp+xx],4c
C6 45 xx 4C MOV byte [ebp+xx],4c
C6 05 yy yy yy yy 4c MOV byte [yyyy_yyyy],4c ;assume DS is still flat.
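
The size comparison can be made concrete by writing the encodings out as byte arrays. This C sketch assumes 32-bit protected-mode encodings, uses zero bytes as placeholders for the displacement and address fields, and counts the trailing immediate byte in each MOV form; the array names and `bytes_saved_by_push` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit mode encodings; displacement/address bytes are placeholders. */
static const uint8_t enc_push_imm8[]     = { 0x6A, 0x4C };                    /* push 4Ch */
static const uint8_t enc_mov_esp_disp8[] = { 0xC6, 0x44, 0x24, 0x00, 0x4C };  /* mov byte [esp+xx],4Ch */
static const uint8_t enc_mov_ebp_disp8[] = { 0xC6, 0x45, 0x00, 0x4C };        /* mov byte [ebp+xx],4Ch */
static const uint8_t enc_mov_abs32[]     = { 0xC6, 0x05, 0, 0, 0, 0, 0x4C };  /* mov byte [addr32],4Ch */

/* Bytes saved per stub by PUSH imm8 compared with a MOV to an absolute address. */
static size_t bytes_saved_by_push(void)
{
    assert(sizeof enc_mov_abs32 > sizeof enc_push_imm8);
    return sizeof enc_mov_abs32 - sizeof enc_push_imm8;
}
```

Size is only part of the argument, of course; the other part is that the stack-relative forms rely on SS alone whereas the absolute form relies on DS being usable.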
Post by Rod Pemberton
"_ISR_no" without quotes is a named variable declared as a "char" in C
which corresponds to an imm8 immediate for x86 assembly, i.e., it's an
OK, so all named variables reside on the stack then?
I won't believe that ...
Or does C quietly assume that DS is always flat and never changes?
Now this won't work for me ...
__
wolfgang
Rod Pemberton
2021-04-25 07:25:20 UTC
Permalink
On Sat, 24 Apr 2021 11:04:17 +0200
Post by wolfgang kern
Post by Rod Pemberton
There is no "MOV reg" here. This is a "MOV m8, imm8" e.g. C6 /0.
The registers aren't accessed.
you can only use a variable which is on stack because DS is unknown yet.
I'm confused by this statement.

I suspect you're talking about 16-bit RM, where DS may have been set to
a segment before the interrupt, which may be different from the segment
required to access the variable. I.e., DS must be set to the segment for
the variable prior to access. (My OS doesn't use RM.)

It's possible that you're saying that the DS segment for RM or DS
selector for PM is destroyed or changed or zero'd upon entry into an
interrupt routine. If so, that should generate a GP fault in PM if
null'd or invalid and accessed. (Hence, my interrupt routine
shouldn't work.) From the INT instruction flow, it looks like a switch
from v86 mode clears segment registers, which can't be used in PM
anyway, as the segment registers need to be loaded with selectors for
PM. (My OS doesn't use v86 mode.)

It's also possible that you're also talking about PM if the selector's
size limit for the segment isn't large enough to access the variable,
e.g., 64K for 16-bit PM, or a small segment limit for a PM selector.
I.e., the variable could be outside the address range for the DS
selector in PM, but that should trigger a GP fault, if the
variable is accessed. (Hence, my interrupt routine shouldn't work.)

AFAIK, (it's been a while since I've looked at the x86 manuals and
coded my OS), the in-use DS selector for PM isn't destroyed, changed,
zero'd, or nullified upon entry into the interrupt. Some registers are
saved on the stack for the interrupt. The call gate (not interrupt
gate) changes the CS selector and EIP offset, but not DS.

For my OS, the selector limits are maxed out, i.e., flat address space.
This works best with C code (more further below), especially for 32-bit
or 64-bit x86. All of my OS' interrupts are in PM. I don't switch to
non 32-bit code segments, e.g., RM, v86, or 16-bit PM, which might use
RM segment registers, nor have a limited segment size for the PM
selectors. I'm also not switching into small RM 64K sized segments,
nor into limited size PM segments. My OS is all flat, large address
space.
Post by wolfgang kern
Post by Rod Pemberton
Hopefully, this will explain it better. I've included WASM inlined
assembly too, which is like MASM. Maybe, it's an easier read as
the x86 instruction arguments aren't reversed.
not really better to understand for me :)
...
Post by wolfgang kern
...
so you can only do a MOV mem to [ESP+...] because only SS is known
here. ...
I'll accept that your DS is possibly incorrect when entering into an
interrupt due to your OS design, choice of processor mode, etc. Should
I ever have problems with my OS for this issue, I'll attempt to recall
this potential problem, and the need to set DS and do a PUSH, like
trivial x86 assembly programs.
Post by wolfgang kern
6A 4C                 PUSH 4c     or
C6 44 24 xx 4C        MOV byte [esp+xx],4c
C6 45 xx 4C           MOV byte [ebp+xx],4c
C6 05 yy yy yy yy 4C  MOV byte [yyyy_yyyy],4c ;assume DS is still flat.
My goal was never shorter/faster, but "Does it work?"

To answer your first question, "PUSH 4c" is shorter at 2 bytes.

To answer your second question, I don't know which is faster. Early
processors took longer to load more bytes, i.e., slower. Modern
processors have so much parallelism, pipelining, and caching that the
longest sequence could be fastest, depending on things like register
stalls, cached micro-code, register remapping, etc.

The last one is cheating, by the way. You didn't assume "DS is still
flat" for my code, saying "you can only use a variable which is on
stack because DS is unknown yet".
Post by wolfgang kern
Post by Rod Pemberton
"_ISR_no" without quotes is a named variable declared as a "char"
in C which corresponds to an imm8 immediate for x86 assembly, i.e.,
OK, so all named variables reside on the stack then?
I won't believe that ...
C's file scope variables or global variables are placed in memory.

C's procedure local variables or auto variables are usually placed on a
LIFO stack for C. C doesn't require a stack, but most C implementations
use a stack, as it allows for recursion, and reduces memory usage.
Post by wolfgang kern
Or does C quietly assume that DS is always flat and never changes?
Now this won't work for me ...
C doesn't require that. However, in my opinion, C works best when
implemented that way.

C requires,

a) that C objects are contiguous allocations of bytes.
(a C byte is the minimum addressable unit of bits, e.g., 8 or 9 or
16 ...)

b) that address pointers correctly compare unequal for different C
objects, and compare equal for the same object - this complicates
code generation for segment:offset addressing when the address space is
small, e.g., for 16-bit RM x86

Implementing these two requirements for C is most easily done with a
large, flat address space, but that isn't required for C. Some old
mainframes store objects in different address spaces. As long as the
results of pointer comparisons are correct, it doesn't matter where the
C objects are stored. However, this complicates the assembly code for
comparing pointers/addresses.

Normally, these two requirements mean that C for x86 is implemented
using just an offset into a large flat address space for x86. This
works nicely for 32-bit or 64-bit x86, but not for 16-bit. For
32-bit or 64-bit PM, the selector's base address (for the segment) is
simply ignored because the flat address space representable by the
offset is sufficiently large.

For a C compiler to be compliant for 16-bit RM, either 1) the pointer
range must be limited to 64K (x86 offset with fixed segment), or 2) the
compiler must do some manipulations to segment:offset addresses. The
latter is done by adjusting segment:offset so that the pointers compare
equal if they reference the same C object. E.g., a C object at address
0x0600:0x0FFF won't compare equal if referenced as 0x0500:0x1FFF and the
raw segment:offset pairs are compared, even though both refer to linear
address 0x6FFF. This would be non-compliant. So, the segment:offset for
both must be converted or adjusted to be on the same segment to ensure a
correct pointer comparison.
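
As a rough illustration of that adjustment, here is a minimal C sketch. It
is not taken from any particular compiler; the type and function names
(far_ptr, far_to_linear, far_normalize, far_equal_raw, far_equal) are made
up for the example, and it assumes plain 16-bit RM addressing below 1MB.

```c
#include <assert.h>
#include <stdint.h>

/* A real-mode far pointer kept as a raw segment:offset pair.
   (Hypothetical type, for illustration only.) */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr;

/* Real-mode linear address: segment * 16 + offset. */
static uint32_t far_to_linear(far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Normalise a pointer so that its offset is always in the range 0..15.
   Assumes the address lies below 1MB, so the segment still fits in 16 bits. */
static far_ptr far_normalize(far_ptr p)
{
    uint32_t linear = far_to_linear(p);
    assert(linear < 0x100000u);
    far_ptr n = { (uint16_t)(linear >> 4), (uint16_t)(linear & 0xFu) };
    return n;
}

/* Naive comparison of the raw pairs: wrong for aliased addresses. */
static int far_equal_raw(far_ptr a, far_ptr b)
{
    return a.segment == b.segment && a.offset == b.offset;
}

/* Comparison on the linear address: equal whenever the same byte is meant. */
static int far_equal(far_ptr a, far_ptr b)
{
    return far_to_linear(a) == far_to_linear(b);
}
```

With a = 0x0600:0x0FFF and b = 0x0500:0x1FFF, far_equal_raw(a, b) is false
while far_equal(a, b) is true, and far_normalize() turns both into
0x06FF:0x000F.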
--
Liberals are drunk and about to crash the car into a tree.
Conservatives need to slam on the brakes.
wolfgang kern
2021-04-25 06:57:51 UTC
Post by Rod Pemberton
Post by wolfgang kern
you can only use a variable which is on stack because DS is unknown yet.
I'm confused by this statement.
I suspect you're talking about 16-bit RM, where DS may have been set to
a segment before the interrupt, which may be different from the segment
required to access the variable. I.e., must set DS to the segment for
variable prior to access. (My OS doesn't use RM.)
OK, now we found out why our points of view differ that much :)

No, I'm not talking about RM16 (even though I use it as well, for BIOS calls).
Yes, my DS changes within PM32 and LM (altering RPL, range and limits to
isolate system data and to have separate regions for every application).
And I, too, never used VM86.

So my OS can alter DS, while your C assumes it's written in stone :)
Post by Rod Pemberton
AFAIK, (it's been a while since I've looked at the x86 manuals and
coded my OS), the in-use DS selector for PM isn't destroyed, changed,
zero'd, or nullified upon entry into the interrupt. Some registers are
saved on the stack for the interrupt. The call gate (not interrupt
gate) changes the CS selector and EIP offset, but not DS.
Yes, it alters and saves only EFL and CS:EIP, but it may swap SS:ESP if
RPL changes (until IRET)
Post by Rod Pemberton
For my OS, the selector limits are maxed out, i.e., flat address space.
...
My toolbox works with unlimited flat segments too, which allow access to
everything (incl. reads of "forbidden" areas, which may make the whole PC hang)

but for the OS delivered to clients I had to put in some security stuff,
and because I dislike paging I've chosen the segmentation path.

My hesitation to ever use C seems to have come about for a reason, long ago.
__
wolfgang
Rod Pemberton
2021-04-27 05:44:09 UTC
On Sun, 25 Apr 2021 08:57:51 +0200
Post by wolfgang kern
Post by Rod Pemberton
Post by wolfgang kern
you can only use a variable which is on stack because DS is unknown yet.
I'm confused by this statement.
I suspect you're talking about 16-bit RM, where DS may have been
set to a segment before the interrupt, which may be different from
the segment required to access the variable. I.e., must set DS to
the segment for variable prior to access. (My OS doesn't use RM.)
OK, now we found out why our points of view differ that much :)
No, I'm not talking about RM16 (even though I use it as well, for BIOS calls).
Yes, my DS changes within PM32 and LM (altering RPL, range and limits to
isolate system data and to have separate regions for every application).
And I, too, never used VM86.
So my OS can alter DS
From your description, I suspect you're using a model similar to DJGPP.
(more further below)

Yes, it's definitely possible that if my OS were to advance to the point
of your OS, that my code would end up being broken ... I'm not at the
point where applications can execute under my OS. (Can I even call it
an OS? ...) When my OS can execute apps, a single address space
becomes more scary. I.e., it becomes easy for an errant application or
a hacker to corrupt the OS. Switching to a segmented model would break
my code.
Post by wolfgang kern
while your C assumes it's written in stone :)
Well, it's more that I set it up that way. Originally, my OS C code
(32-bit) was for DJGPP C compiler only, which uses a segmented memory
model, which I chose not to use. Later, I added support for the
OpenWatcom C compiler which uses a flat memory model. I may be able to
implement segments with OpenWatcom's compiler's code, but it would be
limited as compared to DJGPP's segmented code. Dropping support for
OpenWatcom would allow me to migrate to Linux since DJGPP is GCC based.
However, OpenWatcom produces some very fast code. So, I'm usually torn
as to which way to proceed.

The C language works well with a flat memory model. I set up a flat
memory model for the entire computer without segmentation. With a
segmented memory model, each app can still be flat but just for the
app's address space. I.e., only a self-contained block of compiled C
code needs to have a flat address space for C to work well, and not the
entire computer as I did.

E.g., DJGPP splits application memory into multiple DS segments.
Each DJGPP app gets two DS segments. One DS segment is for accessing
hardware below 1MB. The other DS segment is the space allocated to a
DJGPP application. The app DS is limited in size. The limited size
allows multiple DJGPP apps to be in memory at the same time, each with
a different DS, e.g., #1 app 1MB size @ 1MB address, #2 app 1MB size @
2MB address, ... The limited size also prevents one app from accessing
the other apps in memory.
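
To make the base/size idea concrete, here is a small C sketch of the check
the hardware effectively performs for such a limited data segment. It only
models the layout described above ("#1 app 1MB size @ 1MB address", etc.).
It is not DJGPP's code, and the names (segment, seg_translate) are invented
for the example. A real descriptor carries more than a base and a size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified view of a data segment: just where it starts and how big it is. */
typedef struct {
    uint32_t base;  /* linear address where the segment starts */
    uint32_t size;  /* number of addressable bytes (limit + 1) */
} segment;

/* Translate an access of 'len' bytes at 'offset' within the segment.
   Returns false where the CPU would raise a GP fault, i.e. when any part
   of the access lies beyond the segment's size. */
static bool seg_translate(const segment *s, uint32_t offset, uint32_t len,
                          uint32_t *linear)
{
    assert(s != NULL && linear != NULL);
    if (len == 0 || offset >= s->size || len > s->size - offset)
        return false;
    *linear = s->base + offset;
    return true;
}
```

With app #1 as {0x100000, 0x100000} and app #2 as {0x200000, 0x100000},
offset 0 in each lands at 1MB and 2MB respectively, and no offset that
passes the check for app #1 can reach app #2's memory.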
--
Liberals are drunk and about to crash the car into a tree.
Conservatives need to slam on the brakes.
wolfgang kern
2021-04-27 09:15:44 UTC
... Switching to a segmented model would break my code.
segmentation would break windoze and Loonix too :)
they use generic addressing by paging.
Post by wolfgang kern
while your C assumes it's written in stone :)
Well, it's more that I set it up that way. ..
OK, so my apologies to C then, for seeing it as more stupid than it is :)
The C language works well with a flat memory model. I set up a flat
memory model for the entire computer without segmentation. With a
segmented memory model, each app can still be flat but just for the
app's address space. I.e., only a self-contained block of compiled C
code needs to have a flat address space for C to work well, and not the
entire computer as I did.
E.g., DJGPP splits application memory into multiple DS segments
...
Dunno what DJGPP is or does (it won't fit my demands anyway, I'm afraid).

I allow max. 16 instances to be in memory at a time, and because my
1mSec time-sliced MUX has to handle core functions as well, there is
only space for eight simultaneously running user modules [which was enough].

While all core functions share only three DS-variants and one stack,
each user module gets its own data region and they all share one user-stack.
This shared stack is only possible in my safe-version.
__
wolfgang
Rod Pemberton
2021-04-28 04:47:22 UTC
On Tue, 27 Apr 2021 11:15:44 +0200
Post by wolfgang kern
I allow max. 16 instances to be in memory at a time, and because my
1mSec time-sliced MUX has to handle core functions as well, there is
only space for eight simultaneously running user modules [which was enough].
What happens if someone tries to load 19 instances? Queue up and wait?
FIFO stack?
--
Can we really become carbon neutral with SpaceX and Blue Origin burning
methane?
wolfgang kern
2021-04-28 06:32:18 UTC
Post by Rod Pemberton
Post by wolfgang kern
I allow max. 16 instances to be in memory at a time, and because my
1mSec time-sliced MUX has to handle core functions as well, there is
only space for eight simultaneously running user modules [which was enough].
What happens if someone tries to load 19 instances? Queue up and wait?
FIFO stack?
Because I write my clients' code modules, this can't/won't ever happen :)
KESYS isn't a GP-OS.
But if I try to start more than possible I'll see my own err-msg.

For a more open OS I'd recommend a transparent memory layout which lets
the user decide which of "his" instances are to be closed.
__
wolfgang
Rod Pemberton
2021-04-28 23:14:17 UTC
On Wed, 28 Apr 2021 08:32:18 +0200
Post by wolfgang kern
KESYS isn't a GP-OS.
Didn't you say KESYS was near its end-of-life? ...

(If not, I'm sorry that I recalled that incorrectly.)

But, if KESYS is EOL, well, you could open-source it.

So, other people could make it GP. Or, not.

There are some open-source embedded OSes.

Those are clearly not full-featured OSes. OSes which exist for
specific situations have purpose.

Is KESYS similar in design to any major OS? e.g., DOS, Windows, Linux
--
Can we really become carbon neutral with SpaceX and Blue Origin burning
methane?
wolfgang kern
2021-04-29 09:02:18 UTC
Post by Rod Pemberton
Post by wolfgang kern
KESYS isn't a GP-OS.
Didn't you say KESYS was near its end-of-life? ...
(If not, I'm sorry that I recalled that incorrectly.)
I won't upgrade the sold OS-variants anymore, except that I'll take care of
hardware changes within the next five years (until 2026 or I die).
Post by Rod Pemberton
But, if KESYS is EOL, well, you could open-source it.
Open Source? :) LMFAO thrice! There isn't/wasn't anything like source.
The code itself is my source.
Post by Rod Pemberton
So, other people could make it GP. Or, not
would need a total change of design, better make one from scratch:

* Filesystem (not comparable to any known)
* memory management (total different to what's around)
* mixed code (RM16/Unreal/PM16/PM32/LM)
* KESYS comes only together with a complete PC _and_ client modules
as a desired solution (BIOS may be modified).
Post by Rod Pemberton
There are some open-source embedded OSes.
Those are clearly not full-featured OSes. OSes which exist for
specific situations have purpose.
Is KESYS similar in design to any major OS? e.g., DOS, Windows, Linux
Not to any you may have heard of. It could be seen as a competitor to
those Siemens/Philips and similar production line controls, aka "Field
Programmable Controllers", even though my solutions were designed/demanded
but not created by the clients. And it often uses special hardware add-ons
either designed by me or bought by the client [e.g., fast video acquisition].
__
wolfgang
Rod Pemberton
2021-04-29 20:00:37 UTC
On Thu, 29 Apr 2021 11:02:18 +0200
Post by wolfgang kern
Post by Rod Pemberton
But, if KESYS is EOL, well, you could open-source it.
Open Source? :) LMFAO thrice! There isn't/wasn't anything like
source. The code itself is my source.
You could release a binary package. It's not difficult to disassemble
binary and clean up the assembly. It just takes some time and desire.
Post by wolfgang kern
Post by Rod Pemberton
Is KESYS similar in design to any major OS? e.g., DOS, Windows, Linux
Not to any you may have heard of. It could be seen as a competitor to
those Siemens/Philips and similar production line controls, aka "Field
Programmable Controllers", even though my solutions were designed/demanded
but not created by the clients. And it often uses special hardware
add-ons either designed by me or bought by the client [e.g., fast video
acquisition].
PLCs are (or were) heavily used in the automotive industry for process
control. I never got to do anything related to that. Years ago, I had
a buddy who installed PLCs and programmed them. Unfortunately, I
didn't get to learn much from him, as he didn't like to talk about
work. My understanding was that they were used to implement ladder
logic.

Was fast video data acquisition (DAQ) one of the main selling points of
your OS? How did that come about? Why was such fast video acquisition
needed years ago? e.g., identifying parts on a conveyor? e.g., safety
shutoff? e.g., quality control?

What were the other main uses for your OS?
--
I'd love to answer that, but it would violate the TOS. Liberals can't
handle the truth, because the truth hurts.
wolfgang kern
2021-04-29 22:17:03 UTC
Post by Rod Pemberton
Post by wolfgang kern
Post by Rod Pemberton
But, if KESYS is EOL, well, you could open-source it.
Open Source? :) LMFAO thrice! There isn't/wasn't anything like
source. The code itself is my source.
You could release a binary package. It's not difficult to disassemble
binary and clean up the assembly. It just takes some time and desire.
Sure not difficult if your disassembler is aware of mode-switches and
can distinguish what's data and what's code.
Post by Rod Pemberton
Post by wolfgang kern
Post by Rod Pemberton
Is KESYS similar in design to any major OS? e.g., DOS, Windows,
Linux
Not to any you may have heard of. It could be seen as a competitor to
those Siemens/Philips and similar production line controls, aka "Field
Programmable Controllers", even though my solutions were designed/demanded
but not created by the clients. And it often uses special hardware
add-ons either designed by me or bought by the client [e.g., fast video
acquisition].
PLCs are (or were) heavily used in the automotive industry for process
control. I never got to do anything related to that. Years ago, I had
a buddy who installed PLCs and programmed them. Unfortunately, I
didn't get to learn much from him, as he didn't like to talk about
work. My understanding was that they were used to implement ladder
logic.
Yes, these freely programmable units knew only basic logic and I/O numbers.
I don't know anyone who was/is proud of his work on such stupid stuff.
Because they were extremely slow, my OS+PC solution won by orders of magnitude.

I once built something similar (though much less stupid) with a Z80.
I sold about 400 units, known as the MM8-cube, so I already had a base
clientele for the start of HEXWORK85, which then morphed over time into KESYS2018.
Post by Rod Pemberton
Was fast video data acquisition (DAQ) one of the main selling points of
your OS? How did that come about? Why was such fast video acquisition
needed years ago? e.g., identifying parts on a conveyor? e.g., safety
shutoff? e.g., quality control?
All of this ... and much more.
Mainly used in fully autonomous pharma production lines. Pills and medical
liquids need to pass several quality and safety checks at the highest
possible speed, because time is money and FDA laws are unforgiving.

Think about eye-drops in a tiny glass bottle ... and now check if a
strange image is just an air bubble or a glass splinter. While
air bubbles are pretty frequent and harmless, splinters are not! And
all this during jittering conveyor moves with up to 8 bottles per second.
Post by Rod Pemberton
What were the other main uses for your OS?
There are others who requested top security for their confidential
data, like science, military, government and other paranoid bosses :)
I wrote my SAFE-variant especially for them.

There were also some trials with handicap support. A really interesting
challenge for my mechanical skills in addition to hardware and software.
__
wolfgang

James Harris
2021-04-21 09:13:56 UTC
...
Post by wolfgang kern
and I used the same for true-RM.
Were you programming for the museum?!
Post by wolfgang kern
Post by James Harris
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
on AMD: PUSHA take the same time as three individual pushes,
but POPA may take a bit longer.
OK. PUSHA has to write eight registers to cache whereas three pushes
would, naturally, write only three so what you say is surprising.
--
James Harris
wolfgang kern
2021-04-22 20:14:44 UTC
On 21.04.2021 11:13, James Harris wrote:

...
Post by James Harris
Post by wolfgang kern
and I used the same for true-RM.
Were you programming for the museum?!
NO! but my OS uses both, true-RM for speed and PM16/32 for controls.
my RM and PM IRQ-routines work exactly equal on the same block of
variables. So any mode switches can't lose an event.
Post by James Harris
Post by wolfgang kern
Post by James Harris
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
on AMD: PUSHA take the same time as three individual pushes,
but POPA may take a bit longer.
OK. PUSHA has to write eight registers to cache whereas three pushes
would, naturally, write only three so what you say is surprising.
AMD fused (~1998) PUSHA/PUSHAD beside some others ie: test-jcc pairs.
__
wolfgang
James Harris
2021-04-23 10:41:02 UTC
Post by wolfgang kern
...
Post by James Harris
Post by wolfgang kern
and I used the same for true-RM.
Were you programming for the museum?!
NO! but my OS uses both, true-RM for speed and PM16/32 for controls.
my RM and PM IRQ-routines work exactly equal on the same block of
variables. So any mode switches can't lose an event.
Cool! That's the kind of thing which makes low-level programming fun.

I worked out something similar for handling interrupts in both PM and LM
but I haven't tried it out yet. I can't even remember why I did it. It
might have been that some modes or facilities are not available if the
OS is pure LM.

Do you need RM "for speed" in order to run RM apps natively rather than
in an emulator or in VM86?
Post by wolfgang kern
Post by James Harris
Post by wolfgang kern
Post by James Harris
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
on AMD: PUSHA take the same time as three individual pushes,
but POPA may take a bit longer.
OK. PUSHA has to write eight registers to cache whereas three pushes
would, naturally, write only three so what you say is surprising.
AMD fused (~1998) PUSHA/PUSHAD beside some others ie: test-jcc pairs.
Well, I see in the Software Optimization Guide for AMD Family 10h and
12h Processors that PUSHA is VectorPath (microcode decode) with a
latency of 6. I presume that's in addition to any time taken to write
the registers to cache but maybe not.

AMD family 10h brings us up to products released in 2007 according to

https://en.wikipedia.org/wiki/AMD_10h
--
James Harris
wolfgang kern
2021-04-23 11:17:54 UTC
Post by James Harris
Post by wolfgang kern
Post by James Harris
Post by wolfgang kern
and I used the same for true-RM.
Were you programming for the museum?!
NO! but my OS uses both, true-RM for speed and PM16/32 for controls.
my RM and PM IRQ-routines work exactly equal on the same block of
variables. So any mode switches can't lose an event.
Cool! That's the kind of thing which makes low-level programming fun.
I worked out something similar for handling interrupts in both PM and LM
but I haven't tried it out yet. I can't even remember why I did it. It
might have been that some modes or facilities are not available if the
OS is pure LM.
Do you need RM "for speed" in order to run RM apps natively rather than
in an emulator or in VM86?
Yes, true RM is faster when it comes to gather external events in
real-time like my clients data acquisition from line cameras.
NO paging and NO IOPL-checks in RM.
Post by James Harris
Post by wolfgang kern
Post by James Harris
Post by wolfgang kern
Post by James Harris
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
on AMD: PUSHA take the same time as three individual pushes,
but POPA may take a bit longer.
OK. PUSHA has to write eight registers to cache whereas three pushes
would, naturally, write only three so what you say is surprising.
AMD fused (~1998) PUSHA/PUSHAD beside some others ie: test-jcc pairs.
Well, I see in the Software Optimization Guide for AMD Family 10h and
12h Processors that PUSHA is VectorPath (microcode decode) with a
latency of 6. I presume that's in addition to any time taken to write
the registers to cache but maybe not.
AMD family 10h brings us up to products released in 2007 according to
 https://en.wikipedia.org/wiki/AMD_10h
I have all these, I may remember a wrong date then, but right: 6 cycles,
the same as three individual pushes. Throughput is better for PUSHA.
__
wolfgang
Bernhard Schornak
2021-04-23 13:42:35 UTC
Post by wolfgang kern
Post by wolfgang kern
Post by wolfgang kern
Maybe. But PUSHA will likely be significantly slower. Why incur the extra cost if it's not
needed?
on AMD: PUSHA take the same time as three individual pushes,
but POPA may take a bit longer.
OK. PUSHA has to write eight registers to cache whereas three pushes would, naturally, write
only three so what you say is surprising.
AMD fused (~1998) PUSHA/PUSHAD beside some others ie: test-jcc pairs.
Well, I see in the Software Optimization Guide for AMD Family 10h and 12h Processors that PUSHA is
VectorPath (microcode decode) with a latency of 6. I presume that's in addition to any time taken
to write the registers to cache but maybe not.
AMD family 10h brings us up to products released in 2007 according to
  https://en.wikipedia.org/wiki/AMD_10h
I have all these, I may remember a wrong date then, but right: 6 cycles,
the same as three individual pushes. Throughput is better for PUSHA.
Family 12 and up "translate" PUSH and POP (including PUSHA and
POPA) to MOV. Several MOVs (depending on the architecture) can
be issued simultaneously per clock. The execution time and the
size of simultaneously handled data depends on the implemented
memory interface.


Enjoy the weekend!

Bernhard Schornak
Rod Pemberton
2021-04-21 01:46:05 UTC
On Tue, 20 Apr 2021 12:14:44 +0100
Post by James Harris
Post by Rod Pemberton
On Sun, 18 Apr 2021 11:45:42 +0100
Post by James Harris
Post by Rod Pemberton
On Sat, 17 Apr 2021 11:56:24 +0100
;************************************************************
;
; pic0_manager
;
;************************************************************
push ebx
push ecx
push edx
At some point, this will likely become a PUSHA ... ;)
If for no other reason, than proactive safety.
Maybe. But PUSHA will likely be significantly slower. Why incur the
extra cost if it's not needed?
Post by Rod Pemberton
Post by James Harris
;Zero-base the IRQ number by subtracting that of IRQ0
lea ecx, [eax - 0x20] ;Set ECX to IRQ id (range 0 to 7)
cmp ecx, 7 ;Is really in range 0 to 7?
ja panic ;Abort if error in kernel code
Why wouldn't it be in the range 0 to 7? Since you're passing in the
correct value for the interrupt in eax, how could it not be correct?
Bad coding? ...
/Defensive/ coding. Especially when the routines are under
development. But you are right that such checks should not be
necessary.
Ok. Fair enough, but I'll still recycle your words from just above:

"Maybe. But [it] will likely be [slightly] slower. Why incur the extra
cost if it's not needed?"

--
wolfgang kern
2021-04-17 19:40:46 UTC
Post by James Harris
It's not possible to implement custom IRQ priorities on a traditional
PC, right?
That's what I've thought for many years but I might have finally found a
way to do it!
Here's the idea. Feel free to comment.
The principle is to get the PICs to inform us immediately of any IRQs
which are requesting service so we gain a complete picture of all the
IRQs which require attention.
Then, since we will know of all IRQs which need attention we can process
them in whatever order we want.
The key change from normal is for an interrupt to be EOId before it is
handled rather than afterwards, and for the IMR to be used to prevent
that same interrupt number from firing again.
But as well as masking an interrupt it would be added to a waiting list.
The waiting list could be as simple as a bit array with one bit for each
IRQ number or it could be more extensive but the point is that it would
keep track of which IRQs had been signalled but had not yet been processed.
To illustrate, here's a piece of pseudocode to show what would happen
when an interrupt comes in. All that would have happened before it
starts is that the IRQ number would have been determined and EOI would
have been issued automatically (the AEOI setting).
 Push a minimal set of registers
 Mask this interrupt in the relevant IMR
 Add this interrupt to the set of waiting interrupts
 ...
 Pop the registers pushed earlier
 iret
Aside from saving and restoring registers all that would be done would
be to mask off the current interrupt and to leave a note that the
interrupt is awaiting service.
If half a dozen interrupts all fire at once then each of the six would,
in turn, follow the above code path. That would leave all six masked off
and recorded as awaiting service.
But, naturally, something has to do the actual servicing. It could be
one of the interrupts or a high-priority task.
In the code below I'll make it one of the interrupts. I'll use a nesting
level to determine whether an interrupt is already being serviced or
not. Nesting level will be zero on initial entry and will be 1 if an
interrupt is already being serviced. There would only be the two levels.
The code in the above which is shown as "..." would be
 If nesting level is zero
   Increment nesting level
   Push more registers, as required
   Loop while there are any interrupts waiting
     Pick the IRQ /we/ want to treat as of highest priority
     Delete it from waiting list
     Enable interrupts
     Handle the IRQ
     Disable interrupts
     Unmask the IRQ
   Endloop
   Pop the registers pushed earlier in this fragment
   Decrement nesting level
 Endif
The loop would take interrupts in any order we wanted, thus implementing
custom prioritisation, and it would unmask each one as it completed it.
If any new interrupts fired while the code was running they would be
added to the waiting list and then would be processed in their turn
before the loop completed.
Once the waiting list was empty the loop would terminate. The code would
then, as normal, iret back to whatever had been interrupted.
I think that's it. What do you think? Would it work? Can it be improved?
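
For readers who want to poke at the logic of the quoted scheme without
hardware, here is a minimal C simulation of it for one PIC. It is only a
sketch under stated assumptions: the IMR is a plain variable rather than an
I/O port, the priority_order table and handle_irq() are hypothetical
placeholders, and the points where real code would enable and disable
interrupts are only marked by comments.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_IRQS 8

static uint8_t imr;      /* simulated interrupt mask register: 1 = masked    */
static uint8_t waiting;  /* one bit per IRQ signalled but not yet processed  */
static int nesting;      /* 0 = idle, 1 = the service loop is already active */

/* Hypothetical custom order, highest priority first. */
static const int priority_order[NUM_IRQS] = {3, 1, 0, 7, 4, 2, 6, 5};

static int handled_log[32];  /* records the order in which IRQs were handled */
static int handled_count;

/* Placeholder for the real device handler. */
static void handle_irq(int irq)
{
    assert(handled_count < 32);
    handled_log[handled_count++] = irq;
}

/* Pick the waiting IRQ that /we/ rank highest, or -1 if none is waiting. */
static int pick_highest(void)
{
    for (int i = 0; i < NUM_IRQS; i++) {
        int irq = priority_order[i];
        if (waiting & (1u << irq))
            return irq;
    }
    return -1;
}

/* Entry point for every IRQ; EOI is assumed to have been issued already. */
static void on_interrupt(int irq)
{
    assert(irq >= 0 && irq < NUM_IRQS);
    imr |= (uint8_t)(1u << irq);      /* mask this interrupt          */
    waiting |= (uint8_t)(1u << irq);  /* add it to the waiting list   */

    if (nesting != 0)
        return;                       /* the outer loop will service it */

    nesting++;
    int next;
    while ((next = pick_highest()) >= 0) {
        waiting &= (uint8_t)~(1u << next);  /* delete from waiting list */
        /* real code would enable interrupts here */
        handle_irq(next);
        /* real code would disable interrupts here */
        imr &= (uint8_t)~(1u << next);      /* unmask the IRQ */
    }
    nesting--;
}
```

Setting nesting to 1 by hand before calling on_interrupt() a few times
mimics several IRQs arriving while the loop is busy. Clearing it and
raising one more IRQ then drains the whole list in priority_order.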
Something like this was my first idea for my task MUX :)
but as said in the other thread I treat all hardware IRQs with equal
priority and let the OS decide the order of reactions to IRQs on set
event-flags.
__
wolfgang