Post by ***@gmail.com
Hi Waldek (mainly).
You made a comment that writing an assembler is
not difficult. I'm wondering what IS difficult for a
basic OS and tools - not necessarily exactly MSDOS
which needs to cope with segmentation, but
something that looks like MSDOS, regardless of
whether it runs on ARM or S/370.
Well, in a real OS the big thing is device drivers. If you
look at a recent Linux kernel source tree you will see that 964MB
is source code of drivers. The whole source tree is 1465MB,
so drivers are more than 65%. And the rest includes build
machinery, utilities and documentation which support everything,
so also drivers. And there is a 146MB arch subdirectory, which
contains support for various architectures. Much of the
architecture-dependent code is "driver like": its task is to
handle device-like things such as buses, timers, interrupt
controllers etc.
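The shares can be checked with a few lines of arithmetic. This is
only a sketch using the three sizes quoted above; the variable and
function names are made up for illustration.

```python
# Sizes in MB as quoted in the post for a recent Linux kernel source tree.
DRIVERS_MB = 964
ARCH_MB = 146
TOTAL_MB = 1465

def share(part_mb, total_mb=TOTAL_MB):
    """Return part/total as a percentage."""
    return 100.0 * part_mb / total_mb

drivers_pct = share(DRIVERS_MB)
arch_pct = share(ARCH_MB)
rest_pct = share(TOTAL_MB - DRIVERS_MB - ARCH_MB)

print(f"drivers: {drivers_pct:.1f}%")
print(f"arch:    {arch_pct:.1f}%")
print(f"rest:    {rest_pct:.1f}%")
```

Drivers come out at roughly 65.8% and arch at about 10%, which
leaves under a quarter of the tree for everything else.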
In compilers the hard part is optimization. When I compare gcc-4.8
to gcc-12.0 it seems that code produced by gcc-12.0 is probably
about 10% more efficient than code from gcc-4.8. But the C compiler
in gcc-12.0 is twice as large as the C compiler in gcc-4.8. And
looking back, gcc-4.8 is much bigger than gcc-1.42 (IIRC the C
compiler in gcc-1.42 was of order one megabyte in size).
gcc-12.0 produces more efficient code than gcc-1.42, but
probably no more than 2 times more efficient. Certainly,
code from gcc-12.0 is not 26 times more efficient than code
from gcc-1.42 (which would be the case if speed of object
code were simply proportional to compiler size). And in
turn gcc-1.42 generates more efficient code than simpler
compilers.
Both of the above are quite different from MSDOS, so let
me mention another aspect. "Bug compatibility" with
a different system is hard. Namely, the original developers
(in the case of MSDOS, Microsoft) code in a way that is
convenient to them, and say that the "product is as is".
If you want to compete with MSDOS you need to carefully establish
what MSDOS is doing and then find a way to implement exactly
the same behaviour in your product. This was learned the hard
way by the Wine folks. The original idea was: Linux system calls
provide equivalent functionality to Windows system calls,
so let us create a loader which can load PE executables and
provide a tiny translation layer from Windows system calls
to Linux calls. The loader part went smoothly, but the Wine
folks quickly discovered that there were no "well written"
Windows programs: even "trivial" programs depended on
various tiny details of the Windows interface. Do it differently
and the program will not work.
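One concrete example of such a "tiny detail" (my illustration, not
something taken from the Wine history above): file names are
case-insensitive on Windows but case-sensitive on Linux. The sketch
below uses an in-memory dictionary as a stand-in for the host file
system; all function names are hypothetical.

```python
# Hypothetical host file store: names are case-sensitive, as on Linux.
HOST_FILES = {"Readme.txt": b"hello"}

def host_open(name):
    """Stand-in for a case-sensitive host call."""
    if name not in HOST_FILES:
        raise FileNotFoundError(name)
    return HOST_FILES[name]

def naive_open(name):
    """'Tiny translation layer': pass the request straight to the host."""
    return host_open(name)

def compatible_open(name):
    """Reproduce the case-insensitive lookup that programs silently rely on."""
    wanted = name.lower()
    for real_name in HOST_FILES:
        if real_name.lower() == wanted:
            return host_open(real_name)
    raise FileNotFoundError(name)
```

A program that asks for "README.TXT" fails with naive_open but works
with compatible_open, even though both layers look "equivalent" on
paper.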
For MSDOS there are some specific troubles:
- interfaces were specified in assembler
- the OS had to run acceptably on small and, by modern standards,
slow machines
Looking at this, I think that there were a lot of companies
which could create something with comparable functionality
to MSDOS, so in this sense replicating MSDOS was not hard.
If you want good compatibility and efficiency, then things
get harder, but IIUC there were several companies that could
do this and some that actually did. But there is also the
business aspect: Microsoft from the start used a "tax" method.
Namely, a manufacturer had to pay a moderate fee for each PC they
sold. So even if you got an alternative to DOS you effectively
paid for DOS. And since Microsoft kept prices moderate, there
was price pressure on competitors: a competing product had
to be significantly better than MSDOS to justify its price.
And when a competitor (DR DOS) was doing well, Microsoft put
extra code in Windows to detect that Windows was not
running on top of MSDOS and produce an error message.
Of course, there is also the issue of the size of the whole enterprise.
An MSDOS-class system is approachable by a single person, but not
in a weekend (and probably not in a month). Most people
lack sufficient motivation to spend the needed effort given that
a quite good alternative (FreeDOS) is available with sources.
Post by ***@gmail.com
PDOS/86 (OS): about 30,000 lines
PDPCLIB (C library): about 17,000 lines
SubC (C compiler): about 5,500 lines
as86 (assembler): about 13,000 lines
pdar (archiver): about 1,000 lines
ld86 (linker): about 3,000 lines
pdmake (make): about 2,000 lines
The line counts look a bit high to me, given the limited functionality
of what you have. Especially the line counts for PDOS and as86
look high.
I have the Minix sources: there are 6192 lines in header files, which
include C library headers. IIUC some include files are generated,
so it is not clear if they should be counted as true sources.
There are 7651 lines for bootloaders, 331 lines in mandatory
system configuration files, 38282 lines for the kernel proper,
19868 lines for networking support, and 47361 for system libraries
(including the C library). There are also 18345 lines of test
code (I am not sure if you include test code in your line
counts).
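For a rough side-by-side, the counts can simply be added up. This is
a sketch using only the numbers quoted in this thread; the PDOS
figures are the approximate ones given above, and the two sets do
not cover the same components (the Minix counts contain no compiler
or assembler, the PDOS counts do).

```python
# Minix line counts as listed above (test code kept separate).
minix_lines = {
    "header files": 6192,
    "bootloaders": 7651,
    "system configuration": 331,
    "kernel proper": 38282,
    "networking": 19868,
    "system libraries": 47361,
}
MINIX_TEST_LINES = 18345

# Approximate PDOS/86 toolchain counts as quoted from the original post.
pdos86_lines = {
    "PDOS/86": 30000,
    "PDPCLIB": 17000,
    "SubC": 5500,
    "as86": 13000,
    "pdar": 1000,
    "ld86": 3000,
    "pdmake": 2000,
}

minix_total = sum(minix_lines.values())
minix_total_with_tests = minix_total + MINIX_TEST_LINES
pdos86_total = sum(pdos86_lines.values())

print("Minix (no tests):  ", minix_total)
print("Minix (with tests):", minix_total_with_tests)
print("PDOS/86 toolchain: ", pdos86_total)
```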
Note that Minix includes its own drivers for popular devices
and the source code is both for 8086 and 386 (there are two
versions of assembler code, C code is common).
Originally Minix was written during 3 years of part-time
work by Andrew Tanenbaum. He had a full-time job at a university
and simultaneously wrote a book about operating systems,
using Minix as the example. The code I have is for an expanded
version compared to the original, but probably not more than
twice as large as the original.
Tanenbaum took advantage of the fact that his university had
developed a compiler and related tools (linker, assembler) and
used those for Minix. He also used available Unix utilities. I am
not sure if the command processor (shell) was written specially
for Minix, but it was not included in the counts above.
Linux-0.01 is about 11000 lines of code; this includes drivers
for the "standard" hard disc, keyboard and serial port (it looks
like there is no floppy driver). There is paging and
multitasking. There is a filesystem (Minix compatible). There
are no user-level commands or compilers; one needs to get them
separately. IIUC this is essentially the original version as written
by Linus Torvalds in 6 months.
Wirth and Gutknecht in the 1986-1988 period created the Oberon system.
That included device drivers, an Oberon compiler (Oberon is both
the name of the language used for implementation and the name of the
whole system), a file system and a GUI. Many things in Oberon
look primitive compared to modern systems. But probably it
could do more than DOS. There was some cost: Oberon requires
a 32-bit machine with a graphic (bitmapped) display. Originally
Oberon was written for a processor from National Semiconductor
which is essentially forgotten now. However, the code was ported
to the 386 (IIUC it was not much more than retargeting the compiler)
and there is a more modern version using a custom RISC processor.
Post by ***@gmail.com
I am not very good with algorithms,
Do you really mean "I am not very good with programming"?
When programming you deal with algorithms all the time.
Frankly, it seems that you spent quite a lot of time to
get to the point where you are now. And IIUC a substantial
part of your codebase came from other folks. The examples above
show that other folks in 2-3 years' time got systems that
look more advanced than yours.
Post by ***@gmail.com
nor do I know
much of the theory, so at the moment, only numbers
1 and 2 are within my capability.
Note that I am running up against the 640k limit
with PDOS/86. The OS and command processor
are taking up 300k or something, and when I
try to run pdmake (which opens another command
processor before running another program), I run
out of memory.
Real MSDOS kept COMMAND.COM on disk and loaded it only
when needed. In memory there was only a small resident
stub (and of course the kernel).
BTW: Do you mean 300k when compiled by Watcom or when
compiled by SubC? I would expect the Watcom result to be
significantly smaller than the result from SubC.
Post by ***@gmail.com
I refuse to change the fundamental design to try to
alleviate the memory problems, and instead wish to
run the exact (*) same toolchain in either PM16 or
PM32 with the D bit set to indicate 16-bit.
If I go the PM32 route I am wondering whether I can
make fairly small (LOC) changes to PDOS/386 to make
it accommodate 16-bit (only) programs - the specific
MSDOS tools that have been linked with PDPCLIB - I
don't care about other MSDOS programs that don't
follow "the rules" (*).
(*) The rules aren't set in stone yet. PDPCLIB still
hardcodes 4-bit shifts which won't work on either
a Turbo 186 (8-bit shifts) or the above PM16/32
scenario, and it is only when the rules exist, and
PDPCLIB follows the rules, that I wish to throw
64 MB (to start with) at my MSDOS executables.
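To illustrate the shift issue in footnote (*): in 8086 real mode a
segment:offset pair becomes a linear address by shifting the
segment left 4 bits and adding the offset. The sketch below makes
the shift a parameter; the 8-bit value for the "Turbo 186" is taken
from the post as stated, and the function names are hypothetical.

```python
def linear_address(segment, offset, shift=4):
    """Combine a 16-bit segment and a 16-bit offset into a linear address.

    shift=4 is the classic 8086 real-mode rule; other values model
    variants such as the 8-bit shift mentioned in the post.
    Wrap-around at the top of the address space is not modelled.
    """
    return ((segment & 0xFFFF) << shift) + (offset & 0xFFFF)

def address_space_size(shift):
    """Highest reachable linear address plus one for a given shift."""
    return linear_address(0xFFFF, 0xFFFF, shift) + 1

# The same segment:offset pair lands at different linear addresses,
# so code that hardcodes a 4-bit shift breaks when the shift changes.
addr_8086 = linear_address(0x1234, 0x0010, shift=4)    # 0x12350
addr_shift8 = linear_address(0x1234, 0x0010, shift=8)  # 0x123410

print(hex(addr_8086), hex(addr_shift8))
print(address_space_size(4), address_space_size(8))
```

With a 4-bit shift the reachable space is just over 1 MB; with an
8-bit shift it grows to just over 16 MB, which is why the shift has
to be a rule the library follows rather than a constant buried in
the code.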
Any comment?
Well, it seems that you want to have troubles and
you have them. When writing for small and slow machines
you either need a good optimizing compiler or hand-optimized
assembly, at least for critical parts. When
size is the main concern there are ways to trade some
speed for decreased code size. In particular, using
interpreted byte code one can reduce code size 2-3
times compared to good assembly. With an appropriate mix
of a small amount of fast code (hand-written assembly or
output from an optimizing compiler) and byte code one
can get a small and relatively fast program. Both
segmentation and MSDOS "compatibility" are liabilities:
they bring unnecessary complications.
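A minimal sketch of the byte code idea, assuming a toy stack machine
with four made-up opcodes. It only shows the mechanism (a small
interpreter loop plus a compact program encoding); it does not
demonstrate the 2-3 times size figure, which depends on the real
instruction set and workload.

```python
# Hypothetical opcodes for a toy stack machine (not from the post).
PUSH, ADD, MUL, HALT = range(4)

def run(code):
    """Interpret a byte string and return the value left on top of the stack."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:          # next byte is an immediate operand
            stack.append(code[pc])
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")

# (2 + 3) * 4 encoded in 9 bytes.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)
```

In a real system the interpreter loop would be the small piece of
fast hand-written code, and the bulk of the rarely executed logic
would live in the byte code.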
Post by ***@gmail.com
Note that the LOC are mostly the same for PDOS/386.
Only the assembler and linker change for those.
pdas - 6000 lines
pdld - 2000 lines
Thanks. Paul.
--
Waldek Hebisch