Discussion:
z/PDOS-generic
Paul Edwards
2024-07-18 15:07:52 UTC
For 35+ years I have wondered why there was no MSDOS for
the mainframe. I now have an EBCDIC FAT32 file system on
FBA disks on the mainframe and an operating system that can
do basic manipulation, like typing files.

Search for z/PDOS-generic at https://pdos.org

PDOS-generic has never been fleshed out because I wasn't sure
if it was truly portable or whether I was missing something. The
mainframe is always my go-to place for proving portability.

I'm not sure where to go from here. I think I might get an Atari
clone operational under PDOS-generic (I already have the Amiga)
to try to prove the technique of zapping a BSS variable on load
to inform the executable of the new environment so that it doesn't
do a real trap and instead does a callback. Actually it's mainly on
the mainframe that I need to do that, as the Atari has a control
block on entry that I can fill in with the callback overrides. Note
that I have an Amiga mini-clone already using this technique,
which I run under qemu-m68k (i.e. user, not system) on my
Manjaro Linux on a Pinebook Pro (ARM). My main development
system is still Windows 2000 running under qemu on the PBP,
and I just remembered today that that gave me access to Outlook
Express which I used a long time ago for News, and it still works.
So I didn't need to get my ArcaOS operational after all (which
has Thunderbird).
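
To illustrate the BSS-zap idea, roughly what I have in mind is
something like the following (an untested sketch; the names here are
invented for illustration, not the actual PDOS-generic symbols):

#include <stddef.h>

/* Real trap/SVC path, used when running on the native OS. */
extern int trap_write(int fd, const void *buf, size_t len);

/* Callback table; lives in BSS, so it is all-zero in the on-disk
   image. A hosting environment can patch ("zap") it at load time. */
struct os_callbacks {
    int  (*cb_write)(int fd, const void *buf, size_t len);
    void (*cb_exit)(int rc);
};
struct os_callbacks os_cb;

int do_write(int fd, const void *buf, size_t len)
{
    if (os_cb.cb_write != 0)
        return os_cb.cb_write(fd, buf, len);  /* hosted: callback */
    return trap_write(fd, buf, len);          /* native: real trap */
}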

BFN. Paul.
Grant Taylor
2024-07-19 03:40:29 UTC
For 35+ years I have wondered why there was no MSDOS for the mainframe.
The answer is in the name.

MS-DOS

Microsoft DOS

Micro

micro-computers are the smallest end of the system with mainframes and
supers at the other end of the system.

IBM provided a Disk Operating System for early and / or smaller mainframes.

But Microsoft never provided DOS for mainframes.
--
Grant. . . .
Paul Edwards
2024-07-19 10:43:13 UTC
Sure - but why not make it available anyway? What's the barrier
to someone doing that? No-one is interested? Too much work?
It didn't need to be Microsoft personally. And it can be written
in C to make things easier. Or even some other language - e.g.
CP/M was written in PL/M I think.

BFN. Paul.
Post by Grant Taylor
For 35+ years I have wondered why there was no MSDOS for the mainframe.
The answer is in the name.
MS-DOS
Microsoft DOS
Micro
micro-computers are the smallest end of the system with mainframes and
supers at the other end of the system.
IBM provided a Disk Operating System for early and / or smaller mainframes.
But Microsoft never provided DOS for mainframes.
--
Grant. . . .
Scott Lurndal
2024-07-19 16:18:13 UTC
Post by Paul Edwards
Sure - but why not make it available anyway?
MS-DOS is, was, and always will be a toy. It's not even
a real operating system.

No mainframe user would ever be interested in something
so simplistically useless.
BGB-Alt
2024-07-19 22:12:40 UTC
Post by Scott Lurndal
Post by Paul Edwards
Sure - but why not make it available anyway?
MS-DOS is, was, and always will be a toy. It's not even
a real operating system.
No mainframe user would ever be interested in something
so simplistically useless.
It has a FAT filesystem, MZ loader, and basic console printing and
memory allocation... These cover the main bases for what one needs for
an operating system.


Granted, if one wants memory protection and multiple processes/threads,
this is no longer sufficient as now the OS needs to be able to do all
the other stuff programs might want to be able to do.

Granted, other types of things one might need to deal with is how
programs should be able to interface with OS facilities and device drivers.


Say, for example:
Unix style: System calls identified by number, and treated like a
function call. Most devices are presented as file-like objects (mostly
using file operations or "ioctl()").

COM style interfaces: An object is given with various methods, and a
mechanism exists for mapping these method calls from userspace to kernel
space or between processes.


In my project, I used a hybrid approach, where a range of system-call
numbers were set aside for method calls. There is a system call used to
request an interface object for a given interface.

In this case, an interface ID is given as a pair of 64-bit numbers,
which may be interpreted as FOURCC's, EIGHTCC's, or a UUID/GUID. When
needed, it is possible to tell them apart by looking at bit patterns.
Current thinking is mostly that OS APIs would use FOURCC or EIGHTCC
pairs, whereas private interfaces would use GUIDs.

The object is presented (to the client application) with its VTable
mostly filled up with methods which merely exist to forward their
arguments to the corresponding system-call number (for their location
within the VTable).
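
As a rough sketch of the mechanism (the names, numbers, and layout
here are invented for illustration, not the actual API):

#include <stdint.h>

/* Interface object as seen by the client; the VTable slots mostly
   just forward their arguments to a reserved syscall range. */
typedef struct SysIface SysIface;
struct SysIfaceVt {
    int (*Method0)(SysIface *self, void *args);  /* -> syscall base+0 */
    int (*Method1)(SysIface *self, void *args);  /* -> syscall base+1 */
};
struct SysIface { const struct SysIfaceVt *vt; uint64_t handle; };

extern long sys_call3(long num, long a, long b, long c);  /* raw entry */

#define SYS_METHOD_BASE 0x8000  /* range set aside for methods (made up) */
#define SYS_GET_IFACE   0x0042  /* "give me an interface" (made up) */

int iface_method0(SysIface *self, void *args)
{
    return (int)sys_call3(SYS_METHOD_BASE + 0, (long)self->handle,
                          (long)args, 0);
}

/* Client side: request an interface, identified by a pair of 64-bit
   numbers (FOURCC/EIGHTCC pair or GUID halves). */
SysIface *get_interface(uint64_t iid_lo, uint64_t iid_hi)
{
    SysIface *p = 0;
    sys_call3(SYS_GET_IFACE, (long)iid_lo, (long)iid_hi, (long)&p);
    return p;
}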


Some other devices could present themselves with a file-like or
socket-like interface though.

Though, say, for things like GUI/audio/etc interfaces, a COM-like
interface routed directly over syscalls would have lower overhead, say,
than trying to shoe-horn it through message passing over a socket or
similar.

...
Scott Lurndal
2024-07-19 23:21:22 UTC
Post by BGB-Alt
Post by Scott Lurndal
Post by Paul Edwards
Sure - but why not make it available anyway?
MS-DOS is, was, and always will be a toy. It's not even
a real operating system.
No mainframe user would ever be interested in something
so simplistically useless.
It has a FAT filesystem
Poor performance, silly filename length limitations.
Post by BGB-Alt
, MZ loader,
whatever that might be.
Post by BGB-Alt
and basic console printing and
memory allocation... These cover the main bases for what one needs for
an operating system.
Not on a million dollar mainframe.
Dan Cross
2024-07-19 23:31:32 UTC
Post by Scott Lurndal
Post by BGB-Alt
Post by Scott Lurndal
Post by Paul Edwards
Sure - but why not make it available anyway?
MS-DOS is, was, and always will be a toy. It's not even
a real operating system.
No mainframe user would ever be interested in something
so simplistically useless.
It has a FAT filesystem
Poor performance, silly filename length limitations.
Post by BGB-Alt
, MZ loader,
whatever that might be.
Post by BGB-Alt
and basic console printing and
memory allocation... These cover the main bases for what one needs for
an operating system.
Not on a million dollar mainframe.
Please don't feed the troll. Or do; it's not like this
newsgroup gets much traffic beyond this guy's
weird DOS clone and ramblings about mainframes.

- Dan C.
BGB
2024-07-20 06:30:29 UTC
Post by Scott Lurndal
Post by BGB-Alt
Post by Scott Lurndal
Post by Paul Edwards
Sure - but why not make it available anyway?
MS-DOS is, was, and always will be a toy. It's not even
a real operating system.
No mainframe user would ever be interested in something
so simplistically useless.
It has a FAT filesystem
Poor performance, silly filename length limitations.
True enough.

But, I guess everyone thought 8.3 filenames were fine in the 80s and
early 90s (or, for some of us, they might bring back a bit of childhood
nostalgia, a memory of the times before most everything went over to
free-form long filenames).


Personally, I suspect a limit of 32 or 64 characters would probably be
fine for most uses, though most modern systems have settled on a 256
character name limit.

However, given a lot of systems have settled on a 260 character
"maxpath" or similar, the practical use of a 256 character name limit is
debatable (one can only really use a full length filename in the root
directory, which is less useful).

If it were just me, I would assume a 32-character filename limit, and a
512 character maxpath.


Granted, a 32 character limit might seem imposing for people who prefer
to use the "Hey check it out, my filename is a whole sentence or
paragraph.txt" naming convention...
Post by Scott Lurndal
Post by BGB-Alt
, MZ loader,
whatever that might be.
The MS-DOS ".EXE" format...

It was useful on MS-DOS, granted, not so much at this point.


On more modern systems, this role is typically served by ELF or PE/COFF.

Where, PE/COFF was generally a COFF binary glued onto an MZ stub (which
traditionally displays "This program cannot be run in DOS mode."
and exits).


In my own uses, I dropped the MZ EXE stub, beginning the file at the
'PE' marker. This isn't quite back to being plain COFF, as COFF proper
starts at the machine-type ID. But having a magic FOURCC here is
useful (typically 'PEL4' or similar in my current use).
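
For illustration, sniffing these variants looks something like this
(the MZ/PE offsets are the usual ones; treating 'PEL4' as a FOURCC at
offset 0 is just my reading of the above, not a definitive layout):

#include <stdint.h>
#include <string.h>

enum img_kind { IMG_UNKNOWN, IMG_MZ_PE, IMG_BARE_PE };

enum img_kind sniff_image(const unsigned char *buf, size_t len)
{
    if (len >= 0x40 && buf[0] == 'M' && buf[1] == 'Z') {
        uint32_t e_lfanew;
        memcpy(&e_lfanew, buf + 0x3C, 4);    /* offset of PE header */
        if (e_lfanew + 4 <= len &&
            memcmp(buf + e_lfanew, "PE\0\0", 4) == 0)
            return IMG_MZ_PE;                /* PE/COFF with MZ stub */
    }
    if (len >= 4 && (memcmp(buf, "PE\0\0", 4) == 0 ||
                     memcmp(buf, "PEL4", 4) == 0))
        return IMG_BARE_PE;                  /* stub-less, FOURCC first */
    return IMG_UNKNOWN;
}
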
Post by Scott Lurndal
Post by BGB-Alt
and basic console printing and
memory allocation... These cover the main bases for what one needs for
an operating system.
Forgot to mention, it also had:
keyboard input handling;
Optional support for ANSI escape codes;
...

Well, and a variety of built-in programs, like "edit", "fdisk", and
"format".
Post by Scott Lurndal
Not on a millon dollar mainframe.
Probably not...


I was more asserting that MS-DOS can be used as an operating system (and
was used as such, at one point, on PCs), not really defending that it
would make sense to run it on a mainframe.

So, yeah, how porting an MS-DOS variant to a mainframe would make any
sense, I don't know.


I guess technically, the MS-DOS source has been released, but given much
of it is 8086 assembler, how much use it would be to try to port it is
debatable...
John Ames
2024-07-22 14:51:54 UTC
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
Dan Cross
2024-07-22 15:22:26 UTC
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
I can't think of any now for which that would be true. Maybe
DOS/VS or something?

The idea of a PC operating system on a mainframe is silly. A
single-tasking, unprotected, glorified program loader like DOS
that provided synchronous, programmed IO would be hopelessly
inefficient for a heavy-use mainframe. It's one thing on a
cheap 8- or 16-bit micro where you don't care about wasting
cycles while the user thinks about what to type next. Quite
another on your big compute engine when you want to keep the
CPUs and IO devices running as close to capacity as you can, to
maximize the return on your multi-million dollar hardware
investment.

- Dan C.
John Ames
2024-07-22 16:07:30 UTC
On Mon, 22 Jul 2024 15:22:26 -0000 (UTC)
Post by Dan Cross
I can't think of any now for which that would be true. Maybe
DOS/VS or something?
TENEX's six-character filename limit is the reason Colossal Cave
Adventure is also known as ADVENT ;)
Post by Dan Cross
The idea of a PC operating system on a mainframe is silly.
No argument there. But there's room in life for silliness.
Dan Cross
2024-07-22 17:37:56 UTC
Post by John Ames
On Mon, 22 Jul 2024 15:22:26 -0000 (UTC)
Post by Dan Cross
I can't think of any now for which that would be true. Maybe
DOS/VS or something?
TENEX's six-character filename limit is the reason Colossal Cave
Adventure is also known as ADVENT ;)
Oh, I thought we were being specific to IBM mainframes,
which is almost certainly what the OP was talking about.

ITS certainly had six-character filenames, as did TOPS-10 IIRC,
but TENEX had no such limit; consider the existence of
<SYSTEM>DIRECTORY, for instance. Certainly, any unreasonably
short name limit did not survive into TOPS-20.

https://github.com/PDP-10/tenex/blob/master/pdf/TEN-SYS-2.pdf
suggests that the "primary name string" is of
"indefinite length".
Post by John Ames
Post by Dan Cross
The idea of a PC operating system on a mainframe is silly.
No argument there. But there's room in life for silliness.
Indeed. I don't think OP is making that distinction, though.

- Dan C.
Scott Lurndal
2024-07-22 18:07:19 UTC
Post by Dan Cross
Post by John Ames
On Mon, 22 Jul 2024 15:22:26 -0000 (UTC)
Post by Dan Cross
I can't think of any now for which that would be true. Maybe
DOS/VS or something?
TENEX's six-character filename limit is the reason Colossal Cave
Adventure is also known as ADVENT ;)
Oh, I thought we were being specific to IBM mainframes,
which is almost certainly what the OP was talking about.
ITS certainly had six-character filenames, as did TOPS-10 IIRC,
but TENEX had no such limit; consider the existence of
<SYSTEM>DIRECTORY, for instance. Certainly, any unreasonably
short name limit did not survive into TOPS-20.
https://github.com/PDP-10/tenex/blob/master/pdf/TEN-SYS-2.pdf
suggests that the "primary name string" is of
"indefinite length".
Post by John Ames
Post by Dan Cross
The idea of a PC operating system on a mainframe is silly.
No argument there. But there's room in life for silliness.
Indeed. I don't think OP is making that distinction, though.
Agreed. Even the ANSI Magtape format had 17-character filenames
back in the day. Some older Burroughs systems were limited to 12
characters (six for pack/volume name and six for filename), but
large systems (e.g. B6500 et al) had a longer limit.

The original unix filesystem was limited to 14, IIRC.
Dan Cross
2024-07-22 19:38:41 UTC
Post by Scott Lurndal
Post by Dan Cross
Post by John Ames
On Mon, 22 Jul 2024 15:22:26 -0000 (UTC)
Post by Dan Cross
I can't think of any now for which that would be true. Maybe
DOS/VS or something?
TENEX's six-character filename limit is the reason Colossal Cave
Adventure is also known as ADVENT ;)
Oh, I thought we were being specific to IBM mainframes,
which is almost certainly what the OP was talking about.
ITS certainly had six-character filenames, as did TOPS-10 IIRC,
but TENEX had no such limit; consider the existence of
<SYSTEM>DIRECTORY, for instance. Certainly, any unreasonably
short name limit did not survive into TOPS-20.
https://github.com/PDP-10/tenex/blob/master/pdf/TEN-SYS-2.pdf
suggests that the "primary name string" is of
"indefinite length".
Post by John Ames
Post by Dan Cross
The idea of a PC operating system on a mainframe is silly.
No argument there. But there's room in life for silliness.
Indeed. I don't think OP is making that distinction, though.
Agreed. Even the ANSI Magtape format had 17-character filenames
back in the day. Some older Burroughs systems were limited to 12
characters (six for pack/volume name and six for filename), but
large systems (e.g. B6500 et al) had a longer limit.
The original unix filesystem was limited to 14, IIRC.
Correct. Two bytes for the inode number, and 14 for
the filename, in a 16-byte directory entry. Fixed in
4BSD, where the 4.2 filesystem has a variable length
filename (up to 255 characters) and a "reclen" field
that points to the next (occupied) entry in any given
dir. Creating a new file in some directory basically
meant doing a first-fit search through the directory
file until one could find a suitably sized "slot".
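
Roughly, paraphrased from memory rather than copied from the original
headers:

/* V7-era: fixed 16-byte entries, 2-byte inode number + 14-byte name. */
struct v7_direct {
    unsigned short d_ino;
    char           d_name[14];     /* not necessarily NUL-terminated */
};

/* 4.2BSD FFS: variable-length entries chained together by d_reclen. */
struct bsd_direct {
    unsigned int   d_ino;
    unsigned short d_reclen;       /* distance to the next entry */
    unsigned short d_namlen;       /* actual length of the name */
    char           d_name[255 + 1];/* NUL-terminated, padded out */
};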

Good times.

- Dan C.
John Ames
2024-07-22 18:18:57 UTC
On Mon, 22 Jul 2024 17:37:56 -0000 (UTC)
Post by Dan Cross
ITS certainly had six-character filenames, as did TOPS-10 IIRC,
but TENEX had no such limit; consider the existence of
<SYSTEM>DIRECTORY, for instance. Certainly, any unreasonably
short name limit did not survive into TOPS-20.
I stand corrected...!
BGB
2024-07-22 19:16:26 UTC
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
Looking around a bit, it seems:
MS-DOS: 8.3
Commodore: 15.0
Apple ProDOS: 16.0
Apple Macintosh: 31.0 (HFS)
Early Unix: 14 (~ N.M where N+M+1 <= 14)

Whereas TENEX and some others were 6 character.
OS4000: 8 character
VAX/VMS (and others): 6.3
It seems 6.3 was fairly common on DEC OS's.

Others:
ISO 9660: 30 (variable format, similar to Unix)
UDF: 255
FAT32 and NTFS: 256 (UTF-16)
EXT2/3/4: 256 (UTF-8)

For most uses, a 32 character limit would probably be fine.


In many Apple systems, file type and similar was given in a hidden
"resource fork" rather than encoded in the filename via a file extension
or similar. This seems to be a bit of weirdness fairly specific to Apple
systems.


For an experimental filesystem design of mine (not used much as of yet),
I had used 48-character base names (sufficient "most of the time"), with
an optional encoding for longer names.

Basically using free-form names following Unix-like conventions, albeit
with semi-mandatory file extensions more like in Windows land (binaries
typically use '.exe' and '.dll' extensions; however, unlike Unix style
shells, the file extension is not usually given when invoking a command;
and the extension will be inferred when loading the program).


However, it allows longer names using a scheme similar to FAT32 LFN's,
just with names encoded as UTF-8. Otherwise, the design was similar to
an intermediate between EXT2 and NTFS; though trying to avoid the sorts
of needless complexity seen in NTFS. The LFN's could be omitted, in
which case the name limit would be 48 bytes as UTF-8.


For directories, I went with organizing directory entries in an AVL tree:
Typical directories are not big enough to justify the relative
complexity of a B-Tree (unless aggregating the entire directory tree
structure into a shared B-Tree).
I had gone the route of using disk blocks to encode directories.
Many directories are still big enough that linear search is undesirable.

Hashed directory lookup seems to be popular, but I went with AVL here
(but, with balancing requirements relaxed to depth +/- 3 rather than +/-
1, to reduce the number of rotations needed).


For directory lookups, generally the tree is walked using a specialized
version of "strncmp()" over the 48 character base-name. Names are
encoded as UTF-8, and the "strncmp()" variant is designed to assume that
'char' is unsigned (the standard version could give different results
based on the signedness of 'char' or other factors).

Though, "memcmp()" could probably be used and would give the same
results here (with names NUL padded to 48 bytes as-needed).
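
A minimal sketch of that comparison function (assuming the 48-byte,
NUL-padded names described above):

/* Like strncmp(), but always compares as unsigned bytes, so the result
   does not depend on whether 'char' happens to be signed. */
int fs_name_cmp(const char *a, const char *b, unsigned long n)
{
    const unsigned char *ua = (const unsigned char *)a;
    const unsigned char *ub = (const unsigned char *)b;
    unsigned long i;
    for (i = 0; i < n; i++) {
        if (ua[i] != ub[i])
            return (ua[i] < ub[i]) ? -1 : 1;
        if (ua[i] == 0)            /* both hit NUL: names are equal */
            break;
    }
    return 0;
}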


As I saw it, fully variable length directory entries (like seen in EXT2)
are also undesirable.
So, in this case, directory entries are 64 bytes, with 48 bytes for the
name, and the rest for tree management data and holding inode index.
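
So an entry looks something like this (the field names and the exact
split of the non-name 16 bytes are my own guess for illustration; only
the 48-byte name and 64-byte total are from the description above):

#include <stdint.h>

struct dirent64 {
    char     de_name[48];   /* UTF-8 base name, NUL padded */
    uint32_t de_inode;      /* inode index */
    uint32_t de_left;       /* AVL: left child (entry index) */
    uint32_t de_right;      /* AVL: right child (entry index) */
    uint16_t de_depth;      /* subtree depth, for relaxed balancing */
    uint16_t de_flags;      /* entry type, LFN continuation, etc. */
};                          /* 48 + 16 = 64 bytes */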

Another major structure is the inode table, which:
Is semi-recursive, the inode table itself has an inode,
is allocated much like a file.
Inodes are built from a tagged structure.
Partially inspired by NTFS.
Currently uses a block-allocation scheme similar to EXT2.
Small table of block indices (rough sketch after this list):
Index 0..15: Points directly at target block;
Index 16..23: One level of indirection.
Index 24..27: Two levels of indirection.
Index 28/29: Three levels of indirection.
Index 30: Four levels of indirection.
Index 31: Five levels of indirection.
Span-based allocation was a close second place.
The tagged inode structure could also allow for span-based files.
But, I went with an EXT2 like scheme for now.
Span based allocation would have been more complicated.
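
For the block-index slots above, mapping a logical block number to a
slot and an indirection depth works out to roughly this (assuming 4K
blocks with 8-byte block pointers, i.e. 512 pointers per indirect
block; both numbers are assumptions for the example, not the actual
on-disk parameters):

int slot_for_block(unsigned long long lbn, int *slot)
{
    static const int first[] = { 0, 16, 24, 28, 30, 31 };
    static const int count[] = { 16, 8,  4,  2,  1,  1 };
    unsigned long long per = 1;  /* blocks covered per slot at this depth */
    int depth;

    for (depth = 0; depth <= 5; depth++) {
        unsigned long long covered = count[depth] * per;
        if (lbn < covered) {
            *slot = first[depth] + (int)(lbn / per);
            return depth;        /* 0 = direct, 1 = single indirect, ... */
        }
        lbn -= covered;
        per *= 512;              /* pointers per indirect block (assumed) */
    }
    return -1;                   /* beyond the maximum file size */
}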

The current implementation mostly assumes 512 byte inodes, but
technically it is variable.

In the block indirection tables, unlike EXT2, the lower-levels of
indirection have "shadowed" spaces in the higher levels of indirection.
This was mostly for sake of simplicity (it seemed simpler to just waste
some of the table entries than to go the EXT2 route). Theoretically, the
deeper tables could mirror the shallower tables, but this wasn't done in
the current implementation (easier to not bother).

Similar to filesystems like EXT2 and similar, the first 16 inodes are
currently special/reserved, and used mostly to encode filesystem
metadata (inode table, inode bitmap, root directory, block bitmap, ...).
However, one minor difference being that block numbering is relative to
the start of the partition (so, for example, block 0 in this case is a
NULL block, but technically the superblock exists at this location).
Higher numbered inodes would be used for files and similar.

For now, the special inodes are identified by magic index, unlike the
NTFS MFT which encodes a name for these special entries (maybe later
could add a "magic ID" tag or similar).

TODO might be to consider file compression. No immediate plans for
journaling support.



While a case could have been made for "just use EXT2 or similar", my
main development system is Windows, so pretty much any choice (other
than FAT32 or NTFS or similar) is a similar level of hassle.

So:
FAT32: mostly what I had ended up using thus far,
but with some hacks to support things like symlinks and similar.
NTFS: possible, but the main issue is too much needless complexity.
EXT2: mostly more sane than NTFS, but still some questionable choices.
ExFAT: doesn't address the issues in my case;
basically FAT but with redesigned directories,
and still patent encumbered.
(For FAT32 and the core of NTFS, the patents have expired).


Thus far, had been using FAT32, but using cruft to try to add things
like symlinks and similar on top of FAT32 is ugly.

...
Scott Lurndal
2024-07-22 20:14:29 UTC
Post by BGB
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
MS-DOS: 8.3
Commodore: 15.0
Apple ProDOS: 16.0
Apple Macintosh: 31.0 (HFS)
Early Unix: 14 (~ N.M where N+M+1 <= 14)
Although file suffixes had no intrinsic meaning
for Unix, and were seldom more than a single
character.
Post by BGB
Whereas TENEX and some others were 6 character.
OS4000: 8 character
VAX/VMS (and others): 6.3
VMS filenames were 17 characters originally; OpenVMS
allows much longer names.
Post by BGB
ISO 9660: 30 (variable format, similar to Unix)
UDF: 255
FAT32 and NTFS: 256 (UTF-16)
EXT2/3/4: 256 (UTF-8)
POSIX defines the minimum path length (generally 1024),
but any implementation of POSIX can choose to support
longer filenames; most filesystems are limited to 255
or 256 characters for a path component.
Post by BGB
For most uses, a 32 character limit would probably be fine.
In your use cases, perhaps.
Post by BGB
Basically using free-form names following Unix-like conventions, albeit
with semi-mandatory file extensions more like in Windows land (binaries
typically use '.exe' and '.dll' extensions; however, unlike Unix style
shells, the file extension is not usually given when invoking a command;
and the extension will be inferred when loading the program).
Extensions were, and are, a pile of steaming stuff. They're
completely unnecessary as a component of a filesystem. As
a user-selected convention they're ok (for example, the gcc
driver program selects which language to compile for from
the extension (but it's optional anyway)), but the operating
system knows nothing of extensions.

Some mainframe operating systems encoded the file type in
metadata (Burroughs in the Disk File Header, unix: inode,
apple: resource fork), but that has downsides as well.
BGB
2024-07-22 23:03:29 UTC
Post by Scott Lurndal
Post by BGB
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
MS-DOS: 8.3
Commodore: 15.0
Apple ProDOS: 16.0
Apple Macintosh: 31.0 (HFS)
Early Unix: 14 (~ N.M where N+M+1 <= 14)
Although file suffixes had no intrinsic meaning
for Unix, and were seldom more than a single
character.
There were/are lots of 3 or 4 character file extensions, like ".cpp" or
".html", ...

In Linux, there are lots of multi-part extensions, like ".tar.gz", etc.

Though, I guess in traditional Unix, 1 character was common.
Post by Scott Lurndal
Post by BGB
Whereas TENEX and some others were 6 character.
OS4000: 8 character
VAX/VMS (and others): 6.3
VMS filenames were 17 character orignally, openvms
allows much longer names.
When I was looking at it, VAX/VMS was listed as 6.3, whereas OpenVMS was
longer. Could be wrong, it was a fairly quick/dirty search.
Post by Scott Lurndal
Post by BGB
ISO 9660: 30 (variable format, similar to Unix)
UDF: 255
FAT32 and NTFS: 256 (UTF-16)
EXT2/3/4: 256 (UTF-8)
POSIX defines the minimum path length (generally 1024),
but any implementation of POSIX can choose to support
longer filenames; most filesystems are limited to 255
or 256 characters for a path component.
OK.

Windows has a filename limit of 256, but a path-length limit of 260, so
as noted, you can only put a full-length filename into the root
directory, and putting a long-name file in a long-name directory is
likely to run into the limit.

Things like video downloaders seem to limit the first part of the
filename to around 120 characters or so (typically using the video title
as the filename, and truncating it after this point).


But, yeah, 1024 for an overall path limit makes more sense than 260.
For my own project, I had assumed 512, but either way...

Well, excluding AF_UNIX sockets, which as-is will have a 104 character
name limit... Though, this is more because of the layout for
"sockaddr_un" (where "sockaddr_storage" generally supports up to 128
bytes for the total size).
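
For reference, the layout being assumed is roughly the following (the
field names follow the usual BSD convention, but the exact sizes are my
assumption for this OS rather than a copy of any real system header):

struct sockaddr_un {
    unsigned char  sun_len;        /* total length of this address */
    unsigned char  sun_family;     /* AF_UNIX */
    char           sun_path[104];  /* socket path, NUL-terminated */
};
/* 2 + 104 = 106 bytes, comfortably inside a 128-byte sockaddr_storage. */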

Internally, though, the idea isn't that the actual path for these
sockets is used; rather, they are mashed into a 128-bit hash (where,
internally, pretty much everything can be treated as if it were IPv6).
Post by Scott Lurndal
Post by BGB
For most uses, a 32 character limit would probably be fine.
In your use cases, perhaps.
IME, the vast majority of "normal" files tend to have names shorter than
32 characters.

The video files (within YouTube or similar) seem to primarily use
shorter alphanumeric names, but the video downloaders tend to use the
title as a filename (so may generate longer names...).
Post by Scott Lurndal
Post by BGB
Basically using free-form names following Unix-like conventions, albeit
with semi-mandatory file extensions more like in Windows land (binaries
typically use '.exe' and '.dll' extensions; however, unlike Unix style
shells, the file extension is not usually given when invoking a command;
and the extension will be inferred when loading the program).
Extensions were, and are, a pile of steaming stuff. They're
completely unnecessary as a component of a filesystem. As
a user-selected convention they're ok (for example, the gcc
driver program selects which language to compile for from
the extension (but it's optional anyway)), but the operating
system knows nothing of extensions.
In my case, the filesystem driver and VFS doesn't really know much about
file extensions, but at the level of the shell and program loader, it
knows about extensions.


So, for things like opening files or "readdir()" or similar, it doesn't
care. The VFS doesn't know about LFN's either (rather, these are local
to the FAT driver). Internally, names are normalized to UTF-8 and
treated as case-sensitive (generally normalizing FAT 8.3 names to lower
case).

The handling for generating SFN's from LFN's differs slightly from
Windows regarding FAT32:
Windows: "Program Name.txt" => "PROGNA~1.TXT"
TestKern: "~HHHHHHH.~~~", where the H's are a hash of the LFN.

Mostly because the "~1" convention requires figuring out which names
already exist and advancing a sequence number (what happens when 10+
conflict?...). Simply hashing the LFN is easier (and, if an LFN exists,
no need to care about the SFN as mostly no one will see it).
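
So, something like this (the hash here is FNV-1a as a stand-in; the
hash function actually used may well be different):

#include <stdio.h>
#include <stdint.h>

/* Hash the long name and print it as an 8.3 name of the form
   "~HHHHHHH.~~~" (7 hex digits from a 28-bit hash). */
void lfn_to_sfn(const char *lfn, char sfn[13])
{
    uint32_t h = 2166136261u;
    while (*lfn) {                /* FNV-1a over the UTF-8 name */
        h ^= (uint8_t)*lfn++;
        h *= 16777619u;
    }
    sprintf(sfn, "~%07X.~~~", h & 0x0FFFFFFFu);
}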

It will just use an 8.3 name in cases where the filename matches an 8.3
pattern (and the case can be encoded using WinNT rules).


There may also be some "$META.$$$" files, but these are used internally
by the FS driver and not exposed to programs (but, would be visible if
the drive viewed from Windows). These mostly being part of a hacky
scheme to add additional metadata (along vaguely similar lines to Linux
UMSDOS; just using native VFAT LFN's for the filenames). Unlike UMSDOS
though, the table is keyed using the SFN rather than the location in
the directory (and is at least slightly less brittle).


With a new filesystem, the filesystem itself would not need to care
about file extensions, just encoding filenames (as a UTF-8 blob).

General idea was a scheme like:
0-48: 1 entry;
49-100: 2 entries;
101-220: 4 entries;
221-256: 5 entries (though this has space for 280 bytes).

Where, each extended entry adds 60 bytes, but cuts 8 bytes off the
base-name (for the filename hash). So,
"OverlyLongFileNameThatIsASentance_NeedTOFindMoreToStickOnHere.txt"
has a base name like:
"OverlyLongFileNameThatIsASentance_NeedT~HHHHHHH"
where the H's are a hash of the full name, and are cut off when
rebuilding the name from the LFN entries.



Though, in the case of the program loader, the extension doesn't really
determine how the file is loaded, as the loader itself mostly uses file
magic, e.g. (rough sketch below):
'MZ': PE loader.
'PE': PE loader.
0x7F,'ELF': ELF Loader
'#!': Redirect ("#!pathname\n")

If it appears to be ASCII text, the extension is considered:
".bas": BASIC interpreter.
Else: Shell Script
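
Rough sketch of that dispatch (the loader entry points here are
hypothetical names for illustration, not the actual functions):

#include <string.h>

extern int load_pe(const char *path);
extern int load_elf(const char *path);
extern int run_interp(const char *path, const unsigned char *buf);
extern int run_basic(const char *path);
extern int run_shell_script(const char *path);
extern int looks_like_text(const unsigned char *buf, long len);

int load_image(const char *path, const unsigned char *buf, long len)
{
    if (len >= 2 && buf[0] == 'M' && buf[1] == 'Z') return load_pe(path);
    if (len >= 2 && buf[0] == 'P' && buf[1] == 'E') return load_pe(path);
    if (len >= 4 && buf[0] == 0x7F && memcmp(buf + 1, "ELF", 3) == 0)
        return load_elf(path);
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return run_interp(path, buf);        /* "#!pathname\n" redirect */

    if (looks_like_text(buf, len)) {         /* fall back on extension */
        const char *ext = strrchr(path, '.');
        if (ext && strcmp(ext, ".bas") == 0) return run_basic(path);
        return run_shell_script(path);
    }
    return -1;
}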

The shell will have a list of known executable extensions, and when a
command is typed, will look it up in the following pattern:
Check current directory:
Check first for no extension;
Then tries each known executable extension.
Check everything in the PATH environment variable:
Check first for no extension;
Then, try each known extension.
Else, give up.

Once it finds a matching file, it passes it off to the loader (via a
system call). Current strategy involves trying to open each possible
name (if the open succeeds, it is seen as a hit).
Post by Scott Lurndal
Some mainframe operating systems encoded the file type in
metadata (Burroughs in the Disk File Header, unix: inode,
apple: resource fork), but that has downsides as well.
OK.

Metadata is annoying when files are mostly handled on systems that only
have the filename and the contents (as a big blob of bytes).

Though, generally, it is also preferable to have a file magic, such as a
FOURCC right at the start of the file or similar.

...
Scott Lurndal
2024-07-22 23:58:50 UTC
Post by BGB
Post by Scott Lurndal
Post by BGB
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
MS-DOS: 8.3
Commodore: 15.0
Apple ProDOS: 16.0
Apple Macintosh: 31.0 (HFS)
Early Unix: 14 (~ N.M where N+M+1 <= 14)
Although file suffixes had no intrinsic meaning
for Unix, and were seldom more than a single
character.
There were/are lots of 3 or 4 character file extensions, like ".cpp" or
".html", ...
In Linux, there are lots of multi-part extensions, like ".tar.gz", etc.
The point is, they are arbitrary and not required. Tar quite happily
will unpack an archive named "archive", without any extension.

We've seen, with windows, that when the operating system
(or the user) trusts the extension to accurately reflect the
content of the file, bad things happen.
Post by BGB
Post by Scott Lurndal
Post by BGB
Whereas TENEX and some others were 6 character.
OS4000: 8 character
VAX/VMS (and others): 6.3
VMS filenames were 17 characters originally; OpenVMS
allows much longer names.
When I was looking at it, VAX/VMS was listed as 6.3, whereas OpenVMS was
longer. Could be wrong, it was a fairly quick/dirty search.
I was a systems programmer on the VAX 11/780 for four years
back in the day. And we had a source license :-)
Post by BGB
But, yeah, 1024 for an overall path limit makes more sense than 260.
For my own project, I had assumed 512, but either way...
As noted, that's the POSIX minimum. Implementations are free
to support more, if properly documented.
Post by BGB
Well, excluding AF_UNIX sockets, which as-is will have a 104 character
name limit... Though, this is more because of the layout for
"sockaddr_un" (where "sockaddr_storage" generally supports up to 128
bytes for the total size).
A different namespace, of course, will have different rules.
Post by BGB
Internally though, the idea isn't that the actual path for these sockets
is used though, but rather they are mashed into a 128-bit hash (where,
internally pretty much everything can be treated as-if it were IPv6).
Post by Scott Lurndal
Post by BGB
For most uses, a 32 character limit would probably be fine.
In your use cases, perhaps.
IME, the vast majority of "normal" files tend to have names shorter than
32 characters.
The video files (within YouTube or similar) seem to primarily use
shorter alphanumeric names, but the video downloaders tend to use the
title as a filename (so may generate longer names...).
There's more to the world than what you see.

<snip>
Post by BGB
So, for things like opening files or "readdir()" or similar, it doesn't
care. The VFS doesn't know about LFN's either (rather, these are local
to the FAT driver). Internally, names are normalized to UTF-8 and
treated as case-sensitive (generally normalizing FAT 8.3 names to lower
case).
FAT? Why on earth?
Post by BGB
The handling for generating SFN's from LFN's differs slightly from
Windows: "Program Name.txt" => "PROGNA~1.TXT"
TestKern: "~HHHHHHH.~~~", where the H's are a hash of the LFN.
None of this should be necessary, and it's inherently broken
from a UI standpoint.
BGB
2024-07-23 04:06:44 UTC
Post by Scott Lurndal
Post by BGB
Post by Scott Lurndal
Post by BGB
Post by John Ames
On Fri, 19 Jul 2024 23:21:22 GMT
Post by Scott Lurndal
Poor performance, silly filename length limitations.
I dunno, 8.3 is downright spacious compared to a number of actual
mainframe operating systems...
MS-DOS: 8.3
Commodore: 15.0
Apple ProDOS: 16.0
Apple Macintosh: 31.0 (HFS)
Early Unix: 14 (~ N.M where N+M+1 <= 14)
Although file suffixes had no intrinsic meaning
for Unix, and were seldom more than a single
character.
There were/are lots of 3 or 4 character file extensions, like ".cpp" or
".html", ...
In Linux, there are lots of multi-part extensions, like ".tar.gz", etc.
The point is, they are arbitrary and not required. Tar quite happily
will unpack an archive named "archive", without any extension.
We've seen, with windows, that when the operating system
(or the user) trusts the extension to accurately reflect the
content of the file, bad things happen.
The bigger problem, I think, is that the OS defaults to hiding file
extensions, and many users trust the icon...

So, if they download something with a filename like "SurveyForm.pdf.exe"
with an Acrobat icon, they will assume it is a PDF.
Post by Scott Lurndal
Post by BGB
Post by Scott Lurndal
Post by BGB
Whereas TENEX and some others were 6 character.
OS4000: 8 character
VAX/VMS (and others): 6.3
VMS filenames were 17 characters originally; OpenVMS
allows much longer names.
When I was looking at it, VAX/VMS was listed as 6.3, whereas OpenVMS was
longer. Could be wrong, it was a fairly quick/dirty search.
I was a systems programmer on the VAX 11/780 for four years
back in the day. And we had a source license :-)
OK.

I didn't exist at the time that machine was new...


When my span of existence began, Compaq was making IBM PC clones, and
the NES had already been released. So, some of the information I can
gather is second hand.

Well, and I guess there was also TRON, and the "North American video
game crash" (where apparently they buried a crapload of Atari 2600 E.T.
cartridges in a landfill, ...).

Well, and then I guess Nintendo releasing the NES and Super Mario Bros,
etc. At this point, I existed.


But, like, about the earliest memories I have, are mostly of watching
the "Super Mario" cartoons, and shows like "Captain N" (at a time before
I really started messing with computers, memories from this time are
rather fragmentary).

But, these went away, and were replaced by the "Sonic The Hedgehog"
cartoons, and shows like "ReBoot". I started using computers as Windows
3.x gave way to Windows 95 (was still in elementary school at the time).

Mostly, I started using computers around 3rd grade or so; at the time
computers were generally running Windows 3.11 or similar (then followed
by Windows 95).

By middle school, the world had mostly moved on to Windows 98, but I was
odd and decided to run Windows NT4 (and by high-school went over to
Windows 2000, with Windows XP then making its appearance, ...).

Well, and also poking around on/off with Linux.


For me though, computers now are not all that much different from what I
had in high-school (in the early 2000s).

Most obvious changes being:
More RAM, bigger HDDs;
Loss of floppy drives and CRT monitors;
No more parallel port;
Going from IDE to SATA;
...

Well, and other changes:
The world went from flip-phones to smartphones;
Tablets appeared, and became semi popular;
Laptops went from being cheap and decent, to expensive and kinda trash.


But, now I am an aging millennial and have arguably not accomplished all
that much with my life.
Post by Scott Lurndal
Post by BGB
But, yeah, 1024 for an overall path limit makes more sense than 260.
For my own project, I had assumed 512, but either way...
As noted, that's the POSIX minimum. Implementations are free
to support more, if properly documented.
Fair enough; could increase the internal limit if needed...
Post by Scott Lurndal
Post by BGB
Well, excluding AF_UNIX sockets, which as-is will have a 104 character
name limit... Though, this is more because of the layout for
"sockaddr_un" (where "sockaddr_storage" generally supports up to 128
bytes for the total size).
A different namespace, of course, will have different rules.
Possible.

Some stuff I read implied that AF_UNIX socket addresses were supposed to
map to files in the VFS, but on current systems (like Linux) this does
not seem to be the case.

So, pretty much any arbitrary string will work, but by convention it is
meant to be a VFS path.
Post by Scott Lurndal
Post by BGB
Internally though, the idea isn't that the actual path for these sockets
is used though, but rather they are mashed into a 128-bit hash (where,
internally pretty much everything can be treated as-if it were IPv6).
Post by Scott Lurndal
Post by BGB
For most uses, a 32 character limit would probably be fine.
In your use cases, perhaps.
IME, the vast majority of "normal" files tend to have names shorter than
32 characters.
The video files (within YouTube or similar) seem to primarily use
shorter alphanumeric names, but the video downloaders tend to use the
title as a filename (so may generate longer names...).
There's more to the world than what you see.
From what I have seen, we have:
Traditional Unix paths, like:
"/usr/local/bin/x86_64-linux-elf-gcc"
Traditional Windows paths:
"C:\Program Files (x86)\Some Program\ProgName.EXE"
Traditional source-code naming conventions;
...

Most tending to, most of the time, leading to file-names shorter than 32
characters.

But, as noted, the main exception is using YouTube video titles as
filenames, but even most of these tend to only rarely exceed 100 characters.


Like, say, a "typical" example (actual file name):
"Raggedy Ann - Andy A Musical Adventure 1977 35mm Ultra HD.mp4"

Which weighs in at 62 characters... Also this movie was kinda odd.

But, yeah, I have watched some older shows / movies as well.

Well, another example, in the form of a video title:
"Rainbow Brite Beginning of Rainbow Land Part 1.mp4"

Dunno, this stuff is probably still on YouTube (goes and checks; yeah,
seems 80s Rainbow Brite is still around... I found the show enjoyable at
least).


Well, and I guess technically, if someone wanted, they could go and
binge watch all of "H.R. Pufnstuf" on YouTube, ... But, like, meh.

Well, and/or "He-Man and the Masters of the Universe" (which is at least
kinda amusing at times).


But, decided mostly to not go into writing about a bunch of old TV shows
and similar.

...
Post by Scott Lurndal
<snip>
Post by BGB
So, for things like opening files or "readdir()" or similar, it doesn't
care. The VFS doesn't know about LFN's either (rather, these are local
to the FAT driver). Internally, names are normalized to UTF-8 and
treated as case-sensitive (generally normalizing FAT 8.3 names to lower
case).
FAT? Why on earth?
Because:
I am mostly doing development from Windows;
The only filesystems that Windows natively supports on SDcards are
FAT32, NTFS, and exFAT.

If I were developing on a Linux system, I would probably have jumped
ship over to EXT2 or similar.


Comparably, neither UFS2 or MINIX-FS are particularly compelling either.
MINIX filesystem is limited;
UFS/UFS2 is crufty and weird.

Most of the other "modern" filesystems are fairly complicated (more
focused on performance and reliability on high-end systems, rather than
being designed for a resource-constrained system running from an SDcard).



Like, say, if I can run the filesystem with less LOC than what I already
need for FAT32, and little memory overhead beyond what is needed for a
block-cache and dirent cache and similar, this is good.

So, say, it needs to be under 2.5 kLOC and preferably have less than 128K
of required memory overhead (say, allowing 64K for the block-cache).


My recent experimental filesystem currently weighs in around 1.0 kLOC
(but, would be reduced a bit if read-only; around 400 LOC).

Memory reservation is currently:
~ 64K block-cache (128 sectors, 16x 4K blocks);
~ 32K inode cache (64 inodes);
~ 8K dirent cache (32 extended dirents);
~ 0.5K (superblock header).
...


This is less than currently needed for my FAT32 driver, mostly because
it needs to support 32K clusters (16x 32K = 512K). Granted, a case could
have been made for smaller block caching rather than per-cluster (I
would probably approach caching differently if I were doing it now).

Though, for a read/write filesystem, 4K is a sensible block-size as this
would match the internal block size in typical SDcards (say, they expose
512B sectors to a region of SLC NAND flash, which is then backed in 4K
blocks or similar to a region of QLC NAND flash).
Post by Scott Lurndal
Post by BGB
The handling for generating SFN's from LFN's differs slightly from
Windows: "Program Name.txt" => "PROGNA~1.TXT"
TestKern: "~HHHHHHH.~~~", where the H's are a hash of the LFN.
None of this should be necessary, and it's inherently broken
from a UI standpoint.
This part of the process is mostly buried inside the FAT driver in my
case (unlike on Windows 9x, where one could see it alongside the long
filename).

Though, it seems like Windows 10 no longer exposes the shortname
directly (and short names visible via the Win32 API seem to be synthetic).
Paul Edwards
2024-08-20 20:31:16 UTC
Post by BGB
But, now I am an aging millennial and have arguably not accomplished all
that much with my life.
Didn't you email me decades ago to get some changes implemented
to PDPCLIB and you mentioned you were writing a phenomenal
number of lines of code per day? Where did all that effort go?

Regardless, what sort of thing would you consider to be
"accomplished a significant amount"? You're not going to
single-handedly reproduce Windows 11. So if that is the
bar, no-one at all has accomplished much. It's even difficult
to credit Windows itself. Who are you going to credit?
Tim Paterson? Or Bill Gates's father's (or was it his mother's?)
money?

Note that I am not dismissing Bill Gates's technical achievements
with Microsoft BASIC, but that's not Windows 11 by a very
very very long shot.

BFN. Paul.
BGB
2024-08-28 07:28:14 UTC
Post by Paul Edwards
Post by BGB
But, now I am an aging millennial and have arguably not accomplished all
that much with my life.
Didn't you email me decades ago to get some changes implemented
to PDPCLIB and you mentioned you were writing a phenomenal
number of lines of code per day? Where did all that effort go?
FWIW:

I ended up with a 3D engine, which was around 1 MLOC, sort of like
Minecraft with a Doom3 style renderer. No one cared, performance wasn't
so good (was painfully laggy), and this project fizzled.

Part of the poor performance was the use of a conservative garbage
collector, and rampant memory leaks, ... Another part was "Minecraft
style terrain rendering and stencil shadows don't mix well". Though, for
small light sources, could subset the scene geometry mostly to a
bounding-box around the light source.

But, the sun, well, the sun was kinda evil. Did later move to
shadow-maps for the sun though (though, IIRC, did RGB shadow maps to
allow for colored shadows through colored glass).


Then I wrote a new 3D engine ground-up, which was smaller and had better
performance. Few people cared, I lost motivation, and eventually it
fizzled as well. Was roughly around 0.5 MLOC, IIRC.

It had replaced the complex dynamic lighting with the use of
vertex-color lighting (with a single big rendering pass).


I started on my CPU ISA project, which (checking) is around 2 MLOC (for
the C parts).

It is ~ 3.8 MLOC total, if one includes a lot of ASM and C++ code; but a
fair chunk of this is auto-generated (Verilator output, or debug ASM
output from my compiler).

There is also around 0.8 MLOC of Verilog in my project; but this drops
to 200 kLOC if only counting the current CPU core.




Ironically, the OS for my current ISA project has reused some parts from
my past 3D engine projects.

In the course of all this, ended up doing roughly 3 separate
re-implementations of the OpenGL API (the 3rd version was written to try
to leverage special features of my ISA, though it was originally written
to assume a plain software renderer).

In my current project, I have ports of GLQuake and Quake 3 Arena working
on it; though performance isn't good on a 50MHz CPU.


Ironically, parts of PDPCLIB still remain as a core part of the "OS",
though I had ended up rewriting a fair chunk of it to better fit my
use-case (the "string.c" and "math.c" stuff ended up almost entirely
rewritten, though a fair chunk of "stdio.c" and similar remains intact).
It was also expanded out to cover much of C99 and parts of C11 and C23.

Some wonky modifications were made to support DLLs, which ended up
working in an unusual way in my case (rough sketch below):
The main binary essentially exports a COM interface to its C library;
Most of the loaded DLLs have ended up importing this COM interface,
which provides things like malloc/free, stdio backend stuff, ...
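
Roughly like this (the names and the member list are made up for the
example; the real interface is larger):

#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Table of C-library entry points the main binary hands to each DLL. */
struct crt_iface {
    void *(*crt_malloc)(size_t sz);
    void  (*crt_free)(void *p);
    int   (*crt_puts)(const char *s);
    /* ... stdio backend hooks, etc. ... */
};

/* Main binary side: fill the table from its own C library. */
const struct crt_iface crt_exports = { malloc, free, puts };

/* DLL side: a hypothetical init entry point receives the interface
   and routes the DLL's own allocations through the host. */
static const struct crt_iface *crt;
void dll_init(const struct crt_iface *host) { crt = host; }
void *dll_alloc(size_t n) { return crt->crt_malloc(n); }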


It also has a small makeshift GUI, though mostly just displays a shell
window that can be used to launch programs.


Besides my own ISA, my CPU core also runs RISC-V.

There is a possible TODO effort of trying to implement the Linux syscall
interface for RISC-V Mode, which could potentially allow me to run
binaries letting GCC use the "native" GLIBC, which could make porting
software to it easier (vs the hassle of getting GCC to use my own
runtime libraries; or trying to get programs to build using my own
compiler as a cross-compiler).


Though, I did more or less get my compiler to pretend to be GCC well
enough that, for small programs, it is possible to trick "./configure"
scripts into using it as a cross compiler (this doesn't scale very well,
as apart from some core POSIX libraries, most anything else is absent).

Where, for my own ISA, I am using BGBCC.
BGBCC is ~ 250 kLOC, and mostly compiles C;
Also compiles BGBScript, which sorta resembles ActionScript;
And, BGBScript2, which sorta resembles Java mixed with C#;
Albeit, unlike Java and C#, it uses manual and zone allocation.
Technically could be mixed with C, all using the same ABI;
Also an EC++ like subset of C++.
But, kinda moot as no "Modern C++" stuff has any hope of working.
But, for my current uses, C is dominant.
It is sorta wonky in that it does not use traditional object files.
It compiles into a stack-oriented bytecode and "links" from this.
The bytecode IR could be loosely compared with MSIL / CIL.
ASM code is preprocessed and forwarded as text blobs.
The backend then produces the final PE/COFF images.
Though, this mutated some as well:
Lacks MZ stub / header;
PE image is typically LZ4 compressed.
LZ4 compression makes the loading process faster.
Resource section was replaced with a WAD2 variant.
Made more sense to me than the original PE/COFF resource section.
Compiler also has a built-in format converter.
Say, to convert TGA or PNG into BMP (*1), ...

*1: General resource-section formats:
Graphics:
BMP, 4/8/16/24/32 bit.
Ye Olde standard BMP.
For 16 and 256 color, fixed palettes are used.
BMPA, 4/8 bit with a transparent color.
Basically standard, but with a transparent color.
Generally, the High-Intensity Magenta is transparent.
Or, #FF55FF (or, Color 13 in the 16-color palette)
BMP+CRAM: 2 bpp 256-color, image encoded as 8-bit CRAM.
Supports transparency in a limited form:
Only 1 non-transparent color per 4x4 block,
vs 2 colors for opaque blocks.
QOI: An image in the QOI format (lossless)
LCIF: Resembles a QOI/CRAM hybrid, lossy low/intermediate quality.
Though, BMP+CRAM is faster and has lower overhead.
UPIC: Resembles a Rice-coded JPEG
Optimized for a small low-memory-overhead decoder.
Lossy or Lossless, higher quality, but comparably slow.
Audio:
WAV, mostly PCM, A-Law, or ADPCM.


BGBCC originally started as a fork off of my BGBScript VM, which was
used as the main scripting language in my first 3D engine.

By the 2nd 3D engine, it had partly been replaced by a VM running my
(then) newer BGBScript2 language, with the engine written as a mix of C
and BGBScript2.

While I could technically use BGBScript2 in my TestKern OS, it is almost
entirely C, only really using BGBScript2 for some small test cases (it
is technically possible to use both BS and BS2 in kernel and bare-metal
contexts; and there is partial ISA level assistance for things like
tagged pointers and dynamic type-checking). Where, BS2 retains (from BS,
and its JS/AS ancestors) the ability to use optional dynamic types and
ex-nihilo objects (also BGBCC technically allows doing so in C as well,
with some non-standard syntax, but doing so is "kinda cursed").

Ironically, I am using a memory protection scheme in my ISA based on
performing ACL checks on memory pages. The basic idea for this scheme
was carried over from my original BGBScript VM (where it was applied per
object), where the idea was (ironically) inspired partly by how object
security was handled in the "Tron 2.0" game (in context, as a more
convoluted way of handling the use of keycards for doors). But, I was
left thinking at the time that the idea actually sorta made sense. In
its present form, it mostly involves applying filesystem-style checks
to pages (the MMU remembers this, but raises an exception whenever it
needs the OS to sort out whether a given key can access a given ACL).
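
In rough pseudo-C, the fault-side check amounts to something like the
following (the structures and the keyring model here are my own guesses
at the general shape, not the actual implementation):

#include <stdint.h>

#define ACL_R 1
#define ACL_W 2
#define ACL_X 4

struct acl_entry { uint32_t key_id; uint8_t allow; };
struct acl       { int n; struct acl_entry e[8]; };

/* Called on an access-check fault: does any key held by the current
   task grant the requested access to this page's ACL? The MMU/TLB then
   caches the (key, ACL) decision so most accesses don't fault again. */
int acl_check(const struct acl *acl,
              const uint32_t *keys, int nkeys, uint8_t req)
{
    int i, j;
    for (i = 0; i < nkeys; i++)
        for (j = 0; j < acl->n; j++)
            if (acl->e[j].key_id == keys[i] &&
                (acl->e[j].allow & req) == req)
                return 1;   /* allowed */
    return 0;               /* denied: raise a protection fault */
}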

Well, the use of pointers in my ISA with a 48-bit address and
16 bits of tag metadata in the high-order bits was also itself partly a
carry-over from my Script VMs.



Near the end of my 2nd 3D engine (before the project fizzled out
entirely), it also gained the ability to load BJX2 images into the 3D
engine. In effect, the 3D engine would itself take on a role like an
OS, effectively running logical processes inside the VM (though, there
wasn't really an API to glue these into the game world).

IIRC, the idea I think was to make the "game server" able to run
programs OS style, which could then run parts of the game logic (rather
than necessarily using my BGBSCript2 language running in a VM;
potentially the BS2 VM code could be ported to BGBCC and run inside the
BJX2 VM). Though, potentially, one could also make a case for using
RISC-V ELF images. Wouldn't necessarily want to run native x86-64 code
as it would be desirable to be able to sandbox the programs inside of a
VM. In such a case, the idea would be that things like voxel
manipulation or interaction with world entities could be via COM
objects, or potentially game objects could signal events into the script
programs.


Can note that my BJX2 project was preceded by BJX1, where BJX1 started
out as a modified version of the Hitachi SH-4 ISA (most popularly used
in the SEGA Dreamcast). I had revived BGBCC initially as I needed a
compiler to target BJX1 (and SH-4). As BJX1 turned into a horrible mess
(turned 64-bit, and fragmented into multiple variants), I eventually did
a "partial reboot".

At the ASM level, initially BJX2 was very similar to BJX1, mostly
carrying over the same ASM and ABI, but with minor changes (and gaining
some features and notation inspired by the TMS320). The BJX2 ISA mutated
over time, and has since also fragmented to some extent (and its current
form also has some similarities to SH-5).

It has since drifted towards being more like RISC-V in some areas,
mostly because my CPU core can now also run RISC-V code (and, if RISC-V
needs a feature, and it is useful, may as well also have it in my own ISA).

ASM syntax/style mostly borrowed from SH-4, which seems to be in a
similar category that also includes the likes of MSP430, M68K, PDP-11,
and VAX. Well, as opposed to RISC-V using a more MIPS-like style.



I also don't really have a "proper" userland as of yet, more the kernel,
shell, and most basic programs, all exist as a single binary (so, say,
if you type "ls", the shell handles it itself; with shell instances as
kernel mode threads).

Any "actual" programs are loaded and then spawned as a new process.

Only recently-ish added the ability to redirect IO, but still doesn't
support piping IO between programs.
Supports basic shell-scripts, but lacks most more advanced shell
features (non-trivial Bash scripts will not work).


There was a 3rd 3D engine of mine, mostly because my 2nd 3D engine would
have still been too heavyweight to run on my CPU core (tried to write
something Minecraft-like that would run in a similar memory footprint to
Quake and was fast enough to be tolerable on a 50MHz CPU).

Between the engines:
Chunk Size: 16x16x16 in both engines;
Region Size: 16x16x16 in 2nd engine, 8x8x8 in 3rd.
32x32x8 in first engine.
Thus, in 3rd engine, each region was a 128x128x128 meter cube.
Block Storage:
1st engine: 8 bit index or unpacked;
2nd engine: 4/8/12 bit index into table of blocks;
3rd engine: 4/8 bit index into block table, or unpacked block array.
World Size:
1st engine: Planar
2nd engine: 1024km (1048576 meters), world wraps on edge
3rd engine: 64km (65536 meters), world wraps on edge
Rendering:
1st engine: global vertex arrays, filled from each chunk
2nd engine: Per-chunk vertex arrays
3rd engine: Raycast, visible blocks drawn into global vertex arrays.
Chunk Storage:
1st engine: RLEW (same format as used for maps in Wolf3D and ROTT)
2nd engine: LZ77 + AdRiceSTF
3rd engine: RP2 (similar to LZ4)
Graphics storage:
1st engine: JPEG (modified to support Alpha channel)
2nd engine: BMP + BTIC4B (8x8 Color-Cell, AdRice Bitstream)
3rd engine: DDS (DXT1)
VFS File Storage:
1st engine: ZIP
2nd engine: BTPAK (Hierarchical Central Directory, Deflate)
Large files broken up into 1MB fragments;
3rd engine: WAD4 (Hierarchical Central Directory, RP2)
Large files broken into 128K fragments.
Audio:
1st engine: WAV (PCM)
2nd & 3rd engine: WAV (IMA ADPCM)

Both 2nd and 3rd engine used the same block-type numbers and the same
texture atlas.

Both 2nd and 3rd engine had used mostly sprite graphics, for my first 3D
engine, I had used 3D models (and skeletal animation), but this was a
lot of effort.

I then noted that sprite graphics still worked well in Doom, and
attempted to mimic the use of sprites, though generally using 4 angles
rather than 8, as 4 was easier to draw. Also using a trick as seen in
some old RPG's where one could pull off idle animations and walking by
horizontally flipping the sprite on a timer.

Initial goal (before the 2nd engine effort fizzled out) was to try to
build something like Undertale, but this was more effort, and I was
lacking a good system for dialog and managing game-event dependency trees.

My 3rd engine never got much past "try to make something that works on
my ISA and can fit in under 40-60MB of RAM".

One minor difference was for live entity serialization within regions,
where my 2nd engine had mostly embedded data within ASCII strings,
whereas the 3rd engine had used binary-serialized XML blobs (reusing the
XML code from BGBCC, which, for better or worse, uses XML DOM style
ASTs, but reworked to be a lot more efficient than the original DOM).



There are also some specialized image and video codecs, etc.
There is an experimental video player and MOD/S3M player, though these
are not at present generalized enough to be usable as media players
(that would need some level of UI; thus far they load a hard-coded file
and just play it in a loop).

And, some on/off fiddling with things like Neural Nets, etc.


Recently I wrote a tool to import UFO / GLIF fonts and convert them to
a custom font format (mostly because the actual TTF format seemed
needlessly complicated), along with the code to render this style of
font. It is unclear if it will replace the use of bitmap fonts and SDFs.

Where:
  Bitmap font:
    Specialized for specific glyph sizes;
    Looks good at that size;
    Doesn't really scale.
  SDF font:
    Scalable (works best for medium glyphs);
    Relatively cheap in a computational sense;
    But relatively bulky, and eats a lot of memory.
    I stored them mostly as 8bpp BMP images, 4-bit X/Y,
    where each 16x16 glyph page is a 256x256 BMP
    (see the sampling sketch after this list).
  Variable / Geometric font:
    Scalable (but works best for large glyphs);
    Attempts to draw small glyphs give poor results ATM;
    currently I need to draw at 4x final size and then downsample.
    Higher per-pixel cost;
    Less memory needed to hold the font;
    Can be used to generate SDFs or triangles.
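
For the SDF case, rendering is mostly a sample-and-threshold per pixel.
A hedged sketch, assuming a plain 8bpp field with 0.5 as the glyph edge
(rather than the packed 4-bit X/Y layout mentioned above):

  #include <stdint.h>

  /* Bilinearly sample an 8bpp distance field; 0.5 is assumed to be
     the glyph outline. */
  static float sdf_sample(const uint8_t *sdf, int w, int h, float u, float v)
  {
      int x0 = (int)u, y0 = (int)v;
      float fx, fy, a, b, c, d, top, bot;
      if (x0 < 0) x0 = 0; if (x0 > w - 2) x0 = w - 2;
      if (y0 < 0) y0 = 0; if (y0 > h - 2) y0 = h - 2;
      fx = u - x0; fy = v - y0;
      a = sdf[ y0      * w + x0]; b = sdf[ y0      * w + x0 + 1];
      c = sdf[(y0 + 1) * w + x0]; d = sdf[(y0 + 1) * w + x0 + 1];
      top = a + (b - a) * fx;
      bot = c + (d - c) * fx;
      return (top + (bot - top) * fy) / 255.0f;
  }

  /* Map a sampled distance to pixel alpha with a small linear ramp
     around the edge; 'edge' would normally depend on the on-screen
     glyph scale. */
  static float sdf_coverage(float s, float edge)
  {
      float t = (s - 0.5f + edge) / (2.0f * edge);
      if (t < 0.0f) t = 0.0f;
      if (t > 1.0f) t = 1.0f;
      return t;
  }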



For my ISA / OS project, the fonts and some other things had been
carried over from my 3D engine projects. Well, along with a lot of the
VFS and memory management code (wasn't too much effort to adapt my 3D
engine VFS code to work as an OS VFS).

The main practical difference is that, for an OS VFS, it has a FAT32
driver and similar.
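
As a small illustration of the kind of work such a driver does, here is
a hedged sketch of following a FAT32 cluster chain (it assumes the whole
FAT has already been read into memory, which a real driver on a small
system would avoid):

  #include <stdint.h>
  #include <stddef.h>

  /* Follow a FAT32 cluster chain starting at 'first', writing the
     cluster numbers into 'out'.  In FAT32 only the low 28 bits of
     each table entry are significant; values >= 0x0FFFFFF8 mark
     end-of-chain, and valid data clusters start at 2. */
  static size_t fat32_walk_chain(const uint32_t *fat, uint32_t first,
                                 uint32_t *out, size_t max)
  {
      size_t n = 0;
      uint32_t c = first & 0x0FFFFFFF;
      while (n < max && c >= 2 && c < 0x0FFFFFF8) {
          out[n++] = c;
          c = fat[c] & 0x0FFFFFFF;
      }
      return n;
  }

Each cluster number then maps to a run of sectors in the data area,
which is where the actual file contents get read.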



But I suspect I have slowed down in recent years.
Post by Paul Edwards
Regardless, what sort of thing would you consider to be
"accomplished a significant amount"? You're not going to
single-handedly reproduce Windows 11. So if that is the
bar, no-one at all has accomplished much. It's even difficult
to credit Windows itself. Who are you going to credit?
Tim Paterson? Or Bill Gates's father's (or was it his mother's?)
money?
Note that I am not dismissing Bill Gates's technical achievements
with Microsoft BASIC, but that's not Windows 11 by a very
very very long shot.
Dunno...

It just seems like a lot of other people are getting lots of
recognition, seem to be doing well financially, etc.

Meanwhile, I just sort of end up poking at stuff, and implementing
stuff, and it seems like regardless of what I do, no one gives a crap,
or like I am little better off than had I done nothing at all...
Post by Paul Edwards
BFN. Paul.
Paul Edwards
2024-08-28 08:54:20 UTC
Permalink
Post by BGB
It just seems like a lot of other people are getting lots of
recognition, seem to be doing well financially, etc.
Meanwhile, I just sort of end up poking at stuff, and implementing
stuff, and it seems like regardless of what I do, no one gives a crap,
or like I am little better off than had I done nothing at all...
Did you consider asking anyone at all if they were after
something?
Post by BGB
Where, for my own ISA, I am using BGBCC.
BGBCC is ~ 250 kLOC, and mostly compiles C;
We have struggled and struggled and struggled to try to get
a public domain C90 compiler written in C90 to produce 386
assembler.

There have been a large number of talented people who tried
to do this and fell flat on their face. I never even tried.

The closest we have is SubC.

Is this a market gap you are able and interested in filling?

By either modifying BGBCC (and making public domain if
it isn't already), or using your skills to put SubC over the line?

I can only guarantee that I will recognize your work if you do
this, but that's potentially better than no-one at all. Also, there
is likely to be more than just me who appreciate having a C90
compiler in the public domain.

We currently use the copyrighted GCC 3.2.3 (my modification
of it) in order to get full C90.

There are some other targets of interest besides 386, namely
370, ARM32, x64, 68000. ARM64 would be good too, but
we don't have that at all.

8086 is another target of interest. SubC is already being used
to produce a bootloader for PDOS/386, but Watcom is better
because of SubC's primitive nature.

Thanks. Paul.
Paul Edwards
2024-08-28 08:58:27 UTC
Permalink
Linas Vepstas was kind enough to assist in debugging
binutils i370 and now z/PDOS-generic has a GCC that
is able to do an optimized compile without crashing.

It is also able to make directories.

This is all EBCDIC.

https://pdos.org/zpg.zip

BFN. Paul.
BGB
2024-08-28 23:03:44 UTC
Permalink
Post by Paul Edwards
Post by BGB
It just seems like a lot of other people are getting lots of
recognition, seem to be doing well financially, etc.
Meanwhile, I just sort of end up poking at stuff, and implementing
stuff, and it seems like regardless of what I do, no one gives a crap,
or like I am little better off than had I done nothing at all...
Did you consider asking anyone at all if they were after
something?
I mostly just did stuff, occasionally posting about it on Usenet,
occasionally on Twitter (now known as X...).


For my 3D engines, I posted stuff about them on YouTube; there was
relatively little feedback, and in the time of the first 3D engine it
was mostly people complaining about "ugly graphics" and "looks like
Minecraft" (which was sorta the point).

The 2nd engine looked even more like Minecraft, apart from also taking
minor influences from things like Undertale and Homestuck (but,
generally, was closer to Minecraft than Undertale; apart from the use of
billboard sprites for things like NPCs).


The 3rd engine had some particularly awful sprites, mostly because:
The 2nd engine sprites were generally fairly high res;
For the 3rd engine I just quickly drew some stuff and called it good;
But, the 3rd engine was more meant as a technical proof of concept than
an actual game.

Arguably, I could have tried to "lean into it", maybe do characters as
32x64 pixel art style (with nearest sampling), but didn't bother.

Terrain generation algorithms:
1st engine had used Perlin Noise.
2nd engine had just used X/Y/Z hashing functions and interpolation.
3rd engine, basically same as 2nd engine.

Hash functions are generally better behaved than Perlin noise, though
some care is needed: poor hashing may lead to obvious repeating patterns.
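
Roughly, the hash-and-interpolate approach looks like the sketch below
(the hash constants are generic ones, not what those engines actually
used):

  #include <stdint.h>

  /* Hash a lattice point to a pseudo-random value in 0..1.
     Constants are arbitrary; a weak mix here is what produces the
     visible repeating patterns mentioned above. */
  static float hash3(int32_t x, int32_t y, int32_t z)
  {
      uint32_t h = (uint32_t)x * 374761393u +
                   (uint32_t)y * 668265263u +
                   (uint32_t)z * 2246822519u;
      h = (h ^ (h >> 13)) * 1274126177u;
      h ^= h >> 16;
      return (float)(h & 0xFFFF) / 65535.0f;
  }

  static float lerpf(float a, float b, float t) { return a + (b - a) * t; }

  /* Value noise: hash the 8 surrounding lattice points and blend
     (assumes non-negative coordinates, to keep the example short). */
  static float value_noise3(float x, float y, float z)
  {
      int32_t xi = (int32_t)x, yi = (int32_t)y, zi = (int32_t)z;
      float fx = x - xi, fy = y - yi, fz = z - zi;
      float x00 = lerpf(hash3(xi, yi,   zi  ), hash3(xi+1, yi,   zi  ), fx);
      float x10 = lerpf(hash3(xi, yi+1, zi  ), hash3(xi+1, yi+1, zi  ), fx);
      float x01 = lerpf(hash3(xi, yi,   zi+1), hash3(xi+1, yi,   zi+1), fx);
      float x11 = lerpf(hash3(xi, yi+1, zi+1), hash3(xi+1, yi+1, zi+1), fx);
      return lerpf(lerpf(x00, x10, fy), lerpf(x01, x11, fy), fz);
  }

Summing a few octaves of this at different scales gives terrain
comparable to Perlin noise, without the gradient tables.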



Eventually, I mostly gave up on gamedev, as I couldn't seem to come up
with anything that anyone seemed to care about, and my own motivation in
these areas had largely dried up (and most of the time, I ended up being
more motivated to fiddle with technical stuff, than to really do much in
artistic/creative directions; as "artistic creativity" seems to be an
area where I am significantly lacking).
Post by Paul Edwards
Post by BGB
Where, for my own ISA, I am using BGBCC.
BGBCC is ~ 250 kLOC, and mostly compiles C;
We have struggled and struggled and struggled to try to get
a public domain C90 compiler written in C90 to produce 386
assembler.
There have been a large number of talented people who tried
to do this and fell flat on their face. I never even tried.
The closest we have is SubC.
Is this a market gap you are able and interested in filling?
By either modifying BGBCC (and making public domain if
it isn't already), or using your skills to put SubC over the line?
It is MIT licensed, but doesn't currently produce x86 or x86-64 (as I
mostly just used MSVC and GCC for PC based development).

Rather, backends it currently has are:
BJX2
BJX1 and SH-4 (old)
BSR1 (short lived)
Another custom ISA, inspired by SuperH and MSP430.
Very early versions targeted x86 and x86-64.
But, this backend was dropped long ago.
Did briefly attempt a backend for 32-bit ARM, but this was not kept.
This was in a different fork.
Performance of the generated code was quite terrible.
Didn't really seem worth the bother at the time.

Much of the current backend was initially derived from an 'FRBC'
backend, which was an attempt to do a Dalvik style register IR.
The FRBC VM was dropped, as while fast, the VM was very bulky in terms
of code footprint (combinatorial mess). But, at the time, wasn't a big
step to go from a register IR to an actual CPU ISA, and (for a sensibly
designed ISA), it is possible to emulate things at similar speeds to
what one could get with a similar VM.

My current emulator (for BJX2) is kinda slow, but this is more because
it is usually trying to be cycle-accurate, and as long as it is possible
for it to be (on the PC side of things) faster than the CPU core on the
target FPGA, this is good enough...



AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.

Personally, I am not all that likely to bother with going after anyone
who breaks the terms of the MIT license, as it is pretty close to "do
whatever", similar for 3 clause BSD.

It is also more C95 style, making significant use of // comments and
"long long" and similar, more or less the C dialect that MSVC supported
until around 2015 or so (when they started adding C99 stuff).



I had at one point wanted to try to make a smaller / lighter weight C
compiler, but this effort mostly fizzled out (when it started to become
obvious that I wasn't going to be able to pull off a usable C compiler
in less LOC than the Doom engine, which was part of the original design
goal).

I had also wanted to go directly from ASTs to ASM, say:
Preproc -> Parser/AST -> ASM -> OBJ -> Binary
Vs:
Preproc -> Parser/AST -> RIL -> 3AC -> Machine Code -> Binary


But, likely the RIL and 3AC stages are in-fact useful.
And, it now seems like a stack-based IR (for intermediate storage) has
more advantages than either an SSA based IR (like in Clang/LLVM) or
traditional object files (like COFF or ELF). Well, except in terms of
performance and memory overhead (vs COFF or ELF), where in this case the
"linker" needs to do most of the heavy lifting (and needs to have enough
memory to deal with the entire program).

A traditional linker need only deal with compiled machine-code, so is
more a task of shuffling memory around and doing relocs; with the
compiler parts only needing to deal with a single translation unit.
Though, the main "highly memory intensive" part of the process tends to
be parsing and dealing with ASTs, which is gone by the time one is
dealing with a stack bytecode; but, there is still the memory cost of
translating the bytecode into 3AC to actually compile stuff. This
doesn't ask much by modern PC standards, but is asking a lot when RAM is
measured in MB and one wants to be able to run stuff without an MMU (it
is a downside if the compiler tends to use enough RAM as to make virtual
memory essentially mandatory to be able to run the compiler).
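
As a toy illustration of that stack-bytecode-to-3AC step (the opcodes
and encoding below are invented for illustration, not RIL's actual
format):

  #include <stdio.h>

  /* Toy stack bytecode -> three-address code: pops become operand
     reads, each result gets a fresh temporary.  Purely illustrative. */
  enum { OP_LOAD, OP_ADD, OP_MUL, OP_END };   /* LOAD takes an operand */

  static void stack_to_3ac(const int *code)
  {
      int stack[64], sp = 0, tmp = 0;
      for (;;) {
          switch (*code++) {
          case OP_LOAD:
              printf("t%d = load v%d\n", tmp, *code);
              stack[sp++] = tmp++; code++;
              break;
          case OP_ADD: {
              int b = stack[--sp], a = stack[--sp];
              printf("t%d = t%d + t%d\n", tmp, a, b);
              stack[sp++] = tmp++;
              break;
          }
          case OP_MUL: {
              int b = stack[--sp], a = stack[--sp];
              printf("t%d = t%d * t%d\n", tmp, a, b);
              stack[sp++] = tmp++;
              break;
          }
          case OP_END:
              return;
          }
      }
  }

  int main(void)
  {
      /* (v0 + v1) * v2 */
      int code[] = { OP_LOAD, 0, OP_LOAD, 1, OP_ADD, OP_LOAD, 2, OP_MUL, OP_END };
      stack_to_3ac(code);
      return 0;
  }

In a whole-program setup, the 3AC and its metadata for every function
have to exist at once in the final "link" stage, which is where the
memory pressure described above comes from.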


But, RIL's design still leaves some things to be desired. As-is, it
mostly exists as big linear blobs of bytecode, and the compiler needs to
deal with the whole thing at once. This mostly works for a compiler, but
would be undesirable for use by a VM (or for a more resource-constrained
compiler, which can't just load everything all at once).

But, efforts to change this have tended to fizzle out.
Post by Paul Edwards
I can only guarantee that I will recognize your work if you do
this, but that's potentially better than no-one at all. Also, there
is likely to be more than just me who appreciate having a C90
compiler in the public domain.
We currently use the copyrighted GCC 3.2.3 (my modification
of it) in order to get full C90.
There are some other targets of interest besides 386, namely
370, ARM32, x64, 68000. ARM64 would be good too, but
we don't have that at all.
8086 is another target of interest. SubC is already being used
to produce a bootloader for PDOS/386, but Watcom is better
because of SubC's primitive nature.
BGBCC doesn't currently support any 16-bit targets, mostly only 32 and
64 (well, and an experimental mode that used 128-bit pointers, but this
was shelved due to "at best, it was gonna suck").
Post by Paul Edwards
Thanks. Paul.
Paul Edwards
2024-08-29 03:14:13 UTC
Permalink
Post by BGB
Post by Paul Edwards
Post by BGB
Where, for my own ISA, I am using BGBCC.
BGBCC is ~ 250 kLOC, and mostly compiles C;
We have struggled and struggled and struggled to try to get
a public domain C90 compiler written in C90 to produce 386
assembler.
There have been a large number of talented people who tried
to do this and fell flat on their face. I never even tried.
The closest we have is SubC.
Is this a market gap you are able and interested in filling?
By either modifying BGBCC (and making public domain if
it isn't already), or using your skills to put SubC over the line?
It is MIT licensed, but doesn't currently produce x86 or x86-64 (as I
mostly just used MSVC and GCC for PC based development).
BJX2
BJX1 and SH-4 (old)
BSR1 (short lived)
Another custom ISA, inspired by SuperH and MSP430.
Very early versions targeted x86 and x86-64.
But, this backend was dropped long ago.
Did briefly attempt a backend for 32-bit ARM, but this was not kept.
This was in a different fork.
Performance of the generated code was quite terrible.
Didn't really seem worth the bother at the time.
Much of the current backend was initially derived from an 'FRBC'
backend, which was an attempt to do a Dalvik style register IR.
The FRBC VM was dropped, as while fast, the VM was very bulky in terms
of code footprint (combinatorial mess). But, at the time, wasn't a big
step to go from a register IR to an actual CPU ISA, and (for a sensibly
designed ISA), it is possible to emulate things at similar speeds to
what one could get with a similar VM.
My current emulator (for BJX2) is kinda slow, but this is more because
it is usually trying to be cycle-accurate, and as long as it is possible
for it to be (on the PC side of things) faster than the CPU core on the
target FPGA, this is good enough...
AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.
And if you believe that, then you're welcome to say that this
is public domain, but you may follow the CC0 license instead
if you wish.
Post by BGB
Personally, I am not all that likely to bother with going after anyone
who breaks the terms of the MIT license, as it is pretty close to "do
whatever", similar for 3 clause BSD.
We're not after someone who is allegedly "not going to go
after anyone", we're after some code that is NOT OWNED
by the original author because he/she has RELEASED IT
TO THE PUBLIC DOMAIN.

If the answer is "no", then please say "no".
Post by BGB
It is also more C95 style, making significant use of // comments and
"long long" and similar, more or less the C dialect that MSVC supported
until around 2015 or so (when they started adding C99 stuff).
Actually, so long as it handles C90 syntax, this would be
a step up from what we currently have.
Post by BGB
I had at one point wanted to try to make a smaller / lighter weight C
compiler, but this effort mostly fizzled out (when it started to become
obvious that I wasn't going to be able to pull off a usable C compiler
in less LOC than the Doom engine, which was part of the original design
goal).
We don't necessarily need a lighter weight compiler. That
could be done at a later date. The first thing we need is
something that will take C90 syntax.
Post by BGB
Preproc -> Parser/AST -> ASM -> OBJ -> Binary
Preproc -> Parser/AST -> RIL -> 3AC -> Machine Code -> Binary
But, likely the RIL and 3AC stages are in-fact useful.
And, it now seems like a stack-based IR (for intermediate storage) has
more advantages than either an SSA based IR (like in Clang/LLVM) or
traditional object files (like COFF or ELF). Well, except in terms of
performance and memory overhead (vs COFF or ELF), where in this case the
"linker" needs to do most of the heavy lifting (and needs to have enough
memory to deal with the entire program).
A traditional linker need only deal with compiled machine-code, so is
more a task of shuffling memory around and doing relocs; with the
compiler parts only needing to deal with a single translation unit.
Though, the main "highly memory intensive" part of the process tends to
be parsing and dealing with ASTs, which is gone by the time one is
dealing with a stack bytecode; but, there is still the memory cost of
translating the bytecode into 3AC to actually compile stuff. This
doesn't ask much by modern PC standards, but is asking a lot when RAM is
measured in MB and one wants to be able to run stuff without an MMU (it
is a downside if the compiler tends to use enough RAM as to make virtual
memory essentially mandatory to be able to run the compiler).
But, RIL's design still leaves some things to be desired. As-is, it
mostly exists as big linear blobs of bytecode, and the compiler needs to
deal with the whole thing at once. This mostly works for a compiler, but
would be undesirable for use by a VM (or for a more resource-constrained
compiler, which can't just load everything all at once).
But, efforts to change this have tended to fizzle out.
We don't need the world's best C compiler. At least not
as a first step.
Post by BGB
Post by Paul Edwards
I can only guarantee that I will recognize your work if you do
this, but that's potentially better than no-one at all. Also, there
is likely to be more than just me who appreciate having a C90
compiler in the public domain.
We currently use the copyrighted GCC 3.2.3 (my modification
of it) in order to get full C90.
There are some other targets of interest besides 386, namely
370, ARM32, x64, 68000. ARM64 would be good too, but
we don't have that at all.
8086 is another target of interest. SubC is already being used
to produce a bootloader for PDOS/386, but Watcom is better
because of SubC's primitive nature.
BGBCC doesn't currently support any 16-bit targets, mostly only 32 and
64 (well, and an experimental mode that used 128-bit pointers, but this
was shelved due to "at best, it was gonna suck").
32 and 64 would be a fantastic start, and 99% of the problem.

But if the answer is "no", the answer is "no".

So far the answer is an implied "no".

BFN. Paul.
George Neuner
2024-08-30 10:49:49 UTC
Permalink
On Thu, 29 Aug 2024 11:14:13 +0800, "Paul Edwards"
Post by Paul Edwards
Post by BGB
AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.
And if you believe that, then you're welcome to say that this
is public domain, but you may follow the CC0 license instead
if you wish.
BGB is correct: not all countries recognize the notion of "public
domain".

In WIPO convention countries it generally is possible to release a
work under a license that explicitly grants all rights, but the result
is not quite the same as placing the work in public domain. Without a
legal notion of "public domain" it is not possible for an author to
give up the rights afforded by the (automatic) Berne convention
copyright.

[Of course every country is a WIPO or Berne signatory ... but most
recognize one or both conventions.]

So if you really want a work to be freely usable anywhere in the
world, you can declare it as "public domain" for those countries that
recognize that notion ... but for everywhere else you have to provide
an alternative license that explicitly grants all rights.
George Neuner
2024-08-30 14:27:09 UTC
Permalink
On Fri, 30 Aug 2024 06:49:49 -0400, George Neuner
Post by George Neuner
On Thu, 29 Aug 2024 11:14:13 +0800, "Paul Edwards"
Post by Paul Edwards
Post by BGB
AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.
And if you believe that, then you're welcome to say that this
is public domain, but you may follow the CC0 license instead
if you wish.
BGB is correct: not all countries recognize the notion of "public
domain".
In WIPO convention countries it generally is possible to release a
work under a license that explicitly grants all rights, but the result
is not quite the same as placing the work in public domain. Without a
legal notion of "public domain" it is not possible for an author to
give up the rights afforded by the (automatic) Berne convention
copyright.
[Of course every country is a WIPO or Berne signatory ... but most
^ not
Post by George Neuner
recognize one or both conventions.]
So if you really want a work to be freely usable anywhere in the
world, you can declare it as "public domain" for those countries that
recognize that notion ... but for everywhere else you have to provide
an alternative license that explicitly grants all rights.
Sorry, should have been "... not every country ..."
Paul Edwards
2024-08-31 02:21:53 UTC
Permalink
Post by George Neuner
On Thu, 29 Aug 2024 11:14:13 +0800, "Paul Edwards"
Post by Paul Edwards
Post by BGB
AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.
And if you believe that, then you're welcome to say that this
is public domain, but you may follow the CC0 license instead
if you wish.
BGB is correct: not all countries recognize the notion of "public
domain".
In WIPO convention countries it generally is possible to release a
work under a license that explicitly grants all rights, but the result
is not quite the same as placing the work in public domain. Without a
legal notion of "public domain" it is not possible for an author to
give up the rights afforded by the (automatic) Berne convention
copyright.
[Of course [not] every country is a WIPO or Berne signatory ... but most
recognize one or both conventions.]
So if you really want a work to be freely usable anywhere in the
world, you can declare it as "public domain" for those countries that
recognize that notion ... but for everywhere else you have to provide
an alternative license that explicitly grants all rights.
Isn't that what I just said?

Release it as public domain but say you can use CC0 if you prefer.


BFN. Paul.
George Neuner
2024-08-31 19:30:03 UTC
Permalink
On Sat, 31 Aug 2024 10:21:53 +0800, "Paul Edwards"
Post by Paul Edwards
Post by George Neuner
On Thu, 29 Aug 2024 11:14:13 +0800, "Paul Edwards"
Post by Paul Edwards
Post by BGB
AFAIK, whether declaring something as public domain is legally
recognized depends on jurisdiction. I think this is why CC0 exists.
And if you believe that, then you're welcome to say that this
is public domain, but you may follow the CC0 license instead
if you wish.
BGB is correct: not all countries recognize the notion of "public
domain".
In WIPO convention countries it generally is possible to release a
work under a license that explicitly grants all rights, but the result
is not quite the same as placing the work in public domain. Without a
legal notion of "public domain" it is not possible for an author to
give up the rights afforded by the (automatic) Berne convention
copyright.
[Of course [not] every country is a WIPO or Berne signatory ... but most
recognize one or both conventions.]
So if you really want a work to be freely usable anywhere in the
world, you can declare it as "public domain" for those countries that
recognize that notion ... but for everywhere else you have to provide
an alternative license that explicitly grants all rights.
Isn't that what I just said?
Release it as public domain but say you can use CC0 if you prefer.
BFN. Paul.
I apologize for any offense. I only meant to add information for
those who don't know the reason for the discussion.
wolfgang kern
2024-08-30 11:29:58 UTC
Permalink
On 29/08/2024 01:03, BGB wrote:
...
Post by BGB
It just seems like a lot of other people are getting lots of
recognition, seem to be doing well financially, etc.
Meanwhile, I just sort of end up poking at stuff, and implementing
stuff, and it seems like regardless of what I do, no one gives a crap,
or like I am little better off than had I done nothing at all...
seems we two entered the OS-arena from opposite ends;
I started to write my OS on a paying client's demand ... :)

it never was a general purpose system, but successful solutions sold my
OS w/o any advertising, so I could deliver >200 individually tailored PCs.
Most of the money earned came from user-desired applications rather than
from the OS and hardware [I stopped all hardware production in 1997].

all my guarantee and maintenance contracts end this year,
and because I couldn't buy main-boards w/o UEFI&GPT I stopped working
on the OS as well.

just recently I was asked by long-time clients to give it a try again.
I'm old, tired and I hate all bloatware BS, but I started reading the UEFI
docs, and I had to learn [hate it like the pest] a bit of C to convert this
huge document into technically readable RBIL-styled short pages [in progress].
__
wolfgang
Paul Edwards
2024-09-05 23:46:38 UTC
Permalink
Mainframes are too expensive to use as a simple PC from 30 years
ago. And unlike an old PC, there are almost no programs ready to
run on Paul's system.
When I've "finished", there should be a complete toolchain
and microemacs editor, so any C90 source code should work
(including with embedded ANSI for text fullscreen).
Times have changed and users now want more from their machines.
Times are changing again, and I want more from my users.

I'm looking forward to the day when everyone switches on their
machines and they're all bricked because Intel and AMD had a
drop dead date in their CPUs.

I'll be last man standing in the Philippines with my Zhaoxin CPU,
and of course the mainframes will still be working.

Or something like that. There was a recent bricking of machines
worldwide due to an ACCIDENT at Crowdstrike.

Now what happens when there is a DELIBERATE attack from
someone in (or who has hacked) Microsoft?

I do my development on Windows 2000 - the last version that
didn't need authentication - and I can run it under Linux on a
Zhaoxin CPU. The Zhaoxin comes with a BIOS (in Chinese -
good grief) - that allows me to run PDOS/386 too.

Last. Man. Standing.

I have my backup plan. Good luck to everyone else.

I charge $1000/minute for programming services, and $1000/minute
for time on my Zhaoxin.

You got PDOS for free though.

BFN. Paul.
J. Curtis
2024-09-06 19:08:20 UTC
Permalink
Post by Paul Edwards
I'll be last man standing in the Philippines with my Zhaoxin CPU,
and of course the mainframes will still be working.
Not without power. Without refrigeration city people won't last long.
Paul Edwards
2024-09-07 00:12:10 UTC
Permalink
Post by J. Curtis
Post by Paul Edwards
I'll be last man standing in the Philippines with my Zhaoxin CPU,
and of course the mainframes will still be working.
Not without power. Without refrigeration city people won't last long.
The power grid is dependent on computers being operational?

Regardless, while I am currently in Ligao City, Albay Province,
where I have a manually pumped water well available for when
the public water is either non-existent or dirty, I normally live
halfway between Ligao and Pio Duran. The house opposite us
slaughters pigs at 2am or something.

The grid electricity goes up and down like a yoyo in both places.

I finally found the right portable solar which can be found by
searching for "solar" at pdos.org and I lived for a couple of
months purely off solar for my computing needs. I was using
a Pinebook Pro rather than the Zhaoxin though. While both
have USB-C to charge, only the Pinebook Pro can definitely
be charged from a powerbank. The Zhaoxin says it is
charging, but reality appears to be different; I'm not sure
what the situation is, and regardless I was planning on getting
a different powerbank.

Actually - I'll take any advice on that. The solar I referenced
has a PD (power delivery) outlet that would potentially give
me a lot more power, but I need a matching outdoor powerbank
to accept it, and I don't know of anything suitable (an Amazon
reference would be good and hopefully they ship to the
Philippines).

I already have Fidonet technology software theoretically operational
on PdAndro on my Android phone that will allow me to replace
the internet. Ditto on PDOS/386 - it was actually tested there.

Admittedly there aren't a lot of people to talk to here, but
hopefully there will be some western refugees turning up in
small boats to access the last operational computer network.

Oh yeah - we have a manually operated well on our normal
property too - also protected by dwendes rather than trolls -
that's a potentially valuable concept that may have been lost
in the West, although in both places we're not actually
drinking that water. I asked if we could boil it but didn't get
a good answer, and it hasn't been a priority to push the issue.

The irony is that I was happy to be a city slicker in Sydney,
but being in this new environment made me take an interest
in how basic needs were able to be satisfied, and especially
whether we were dependent on Saudi Arabia. I'm obviously
not expecting further deliveries of solar panels, but I will be
armed with some computing power for some time even
without ALECO.

Theoretically.

BFN. Paul.
Paul Edwards
2025-01-22 06:34:09 UTC
Permalink
Post by Paul Edwards
For 35+ years I have wondered why there was no MSDOS for
the mainframe. I now have an EBCDIC FAT32 file system on
FBA disks on the mainframe and an operating system that can
do basic manipulation, like typing files.
And now I can compile C programs, and gccmvs (3.2.3) can
reproduce itself, byte-exact.

Which is what I normally use to judge integrity.

zpg.zip and herc32.zip from https://pdos.org

BFN. Paul.

J. Curtis
2024-07-19 23:02:30 UTC
Permalink
Post by Scott Lurndal
MS-DOS is, was, and always will be a toy
Small toys and big toys, are all toys.
John Ames
2024-07-19 16:35:33 UTC
Permalink
On Fri, 19 Jul 2024 18:43:13 +0800
Post by Paul Edwards
Sure - but why not make it available anyway? What's the barrier
to someone doing that? No-one is interested? Too much work?
It didn't need to be Microsoft personally. And it can be written
in C to make things easier. Or even some other language - e.g.
CP/M was written in PL/M I think.
Well, 35 yrs. ago, x86 wasn't even a thing in the "mainframe" (by which
we presumably mean "large-scale, heavy-duty business computing") space;
in 1989 e.g. CompuServe was still entirely a PDP-10 shop, IBM had just
rolled out its AS/400 line, and the IA-32 architecture was still four
years away from even going superscalar. Others here have far more
direct knowledge of the "mainframe" space than myself, and can feel
free to correct me, but AFAIK x86 systems didn't see broad acceptance
in truly heavy-duty business computing 'til the mid-'00s.

And while MS-DOS can certainly be used for classic batch processing, it
has practically no support for multitasking, which was already a thing
in the mainframe space all the way back to the '60s, because any given
batch job will not *necessarily* make maximal use of the computer, and
at large scale it makes no sense to leave available resources idle.
It's possible to set specialized utilities running as TSRs in DOS, but
the system as a whole is not designed for more than one "real" program
to run at a time - so sharing the system between large numbers of
individual jobs in a generalized way simply isn't possible.

So, in short: there was no mainframe hardware platform that it could be
ported to back in the day, and it's not well-suited for that use case.
One certainly *could* get it running on, say, a large x86 cluster as a
novelty, but it's not a huge surprise that, thus far, nobody has been
thus inclined.
Paul Edwards
2024-08-20 20:20:24 UTC
Permalink
Post by John Ames
And while MS-DOS can certainly be used for classic batch processing, it
has practically no support for multitasking, which was already a thing
So, in short: there was no mainframe hardware platform that it could be
ported to back in the day, and it's not well-suited for that use case.
One certainly *could* get it running on, say, a large x86 cluster as a
novelty, but it's not a huge surprise that, thus far, nobody has been
thus inclined.
I'm not familiar with "clusters". Could you tell me what this
"novelty" port would look like?

Thanks. Paul.
John Ames
2024-08-21 16:51:01 UTC
Permalink
On Wed, 21 Aug 2024 04:20:24 +0800
Post by Paul Edwards
I'm not familiar with "clusters". Could you tell me what this
"novelty" port would look like?
"Clusters" being "large numbers of discrete systems across which work
is distributed," an idea that goes back at least to the Transputer but
which really took off in high-performance computing in the early '00s
(IIRC,) when commodity PC hardware reached a performance level such that
it was practical to use "a bunch of PCs in a network" as a replacement
for a single high-performance computer of some other flavor, depending
on the job.

So, for the sake of argument, let's say you got MS-DOS running on such
a platform - certainly possible, since it's fundamentally a PC (leaving
aside issues of e.g. real-mode BIOS vs. UEFI or getting packet drivers
for the NIC, as well as getting the network stack going.) You then have
a large number of DOS PCs on which you can run one (1) job at a time.

Now, assuming that you're doing this because you have a large number of
jobs which you'd like to power through in a maximally efficient manner,
you'd also need some kind of supervisory system to distribute jobs from
the pile to individual nodes in the cluster. There's no reason this
couldn't also run on MS-DOS; in any case, you'd need software on both
ends to *A.* schedule the job for a particular node, *B.* provide that
node with access to the files/resources needed to do it, and *C.* keep
it rolling from one job right into the next.
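
As a toy sketch of the kind of supervisory scheduler being described
(entirely hypothetical, in-memory only, with no actual network or DOS
code):

  #include <stdio.h>

  #define NUM_NODES 4
  #define NUM_JOBS  10

  /* -1 means the node is idle; otherwise it holds the number of the
     job it is currently running. */
  static int node_job[NUM_NODES];

  int main(void)
  {
      int next_job = 0, done = 0, i;
      for (i = 0; i < NUM_NODES; i++)
          node_job[i] = -1;

      while (done < NUM_JOBS) {
          for (i = 0; i < NUM_NODES; i++) {
              if (node_job[i] >= 0) {
                  /* pretend the node finished its job this tick */
                  printf("node %d finished job %d\n", i, node_job[i]);
                  node_job[i] = -1;
                  done++;
              }
              if (node_job[i] < 0 && next_job < NUM_JOBS) {
                  /* assign the next job from the pile to this idle node */
                  node_job[i] = next_job++;
                  printf("node %d assigned job %d\n", i, node_job[i]);
              }
          }
      }
      return 0;
  }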

Now you've got things going; but it's a far cry from maximum efficiency,
because the hardware on each node in the cluster is almost certainly
capable of multi-threaded operation, but DOS has no support for multi-
processing at all. (It's also probably a 64-bit system, but we'll say
for the sake of argument that you've got some kind of amd64 equivalent
to a DPMI going - has anyone written one yet? - so that your application
can at least make full-ish use of a single CPU core.)

There's two ways you could go about handling this. You could attempt to
extend your single-threaded MS-DOS application into a multi-threaded
one, handling all the scheduling and resource-contention issues within
itself. At this point, you've more or less implemented a different OS
on top of DOS (like pre-NT Windows.) Not terribly ideal, since you are
(presumably) still handling I/O through DOS and your DOS-based network
stack, which are single-threaded and will bottleneck all the other
threads. (This was something that Amiga programmers used to deal with:
pre-emptive multitasking bolted awkwardly to a single-threaded DOS.)

Alternatively, you could choose to virtualize; modern implementations
of the amd64 architecture support this natively in hardware, so you can
put even your 64-bit DPMI-enabled mutant MS-DOS program in a container
such that it thinks it's running by itself on a single-threaded CPU,
and then run as many of those in parallel as you have CPU threads. Of
course, you'll need a hypervisor system in place for this; for the sake
of argument, you could probably *also* run this on MS-DOS, but I very
much doubt anyone's written such a beast, so you'd probably have to do
it yourself. You might also need to extend your supervisor-node
software to parcel out multiple jobs to each machine, unless all the
hypervisor-guest systems appear as individual nodes on the network
(which they certainly could.)

So, to summarize: all you need in order to accomplish this is 1. a DOS-
based hypervisor which almost certainly doesn't exist, 2. a 64-bit DPMI
extender which probably doesn't, 3. DOS-based remote job execution
tools which might conceivably already exist, but may not, 4. your own
particular mutant 64-bit MS-DOS application, and 5. a task sufficiently
large/intensive to justify all this effort on in the first place.

Should make for a nice weekend project!
Paul Edwards
2024-08-22 05:24:02 UTC
Permalink
Post by John Ames
So, to summarize: all you need in order to accomplish this is
This is a very complicated new system. That is not my goal.
My goal is a simple starter system. z/PDOS-generic is an
example of a simple starter system.
Post by John Ames
1. a DOS-
based hypervisor which almost certainly doesn't exist, 2. a 64-bit DPMI
extender which probably doesn't,
Note that I have 32-bit MSDOS which is accomplished by
switching from PM32 to RM16 in order to make BIOS calls.
This works on an AMD64-like processor if someone has
made a BIOS available. I actually bought a Lenovo Kaitian
with a Zhaoxin processor in order to get this. The BIOS is
literally in Chinese and I needed help from a friend in order
to know how to switch between UEFI and legacy BIOS.

Also note that I have a thin wrapper on top of UEFI that switches
a UEFI system into a mini Windows 64-bit clone. That is also
MSDOS-like.

And at this level you need to define what "MSDOS" actually means.

BFN. Paul.
Grant Taylor
2024-07-20 00:46:03 UTC
Permalink
Post by Paul Edwards
Sure - but why not make it available anyway? What's the barrier
to someone doing that? No-one is interested? Too much work?
I believe you answered your own question.
Post by Paul Edwards
It didn't need to be Microsoft personally.
Assuming the MS in MS-DOS stands for Microsoft, yes, it does need to be
Microsoft.

If you just want DOS on a mainframe, IBM did that.

Link - DOS/360 and successors - Wikipedia
- https://en.wikipedia.org/wiki/DOS/360_and_successors
--
Grant. . . .
Paul Edwards
2024-08-20 20:18:11 UTC
Permalink
Post by Grant Taylor
Post by Paul Edwards
Sure - but why not make it available anyway? What's the barrier
to someone doing that? No-one is interested? Too much work?
I believe you answered your own question.
Post by Paul Edwards
It didn't need to be Microsoft personally.
Assuming the MS in MS-DOS stands for Microsoft, yes, it does need to be
Microsoft.
If you just want DOS on a mainframe, IBM did that.
Link - DOS/360 and successors - Wikipedia
- https://en.wikipedia.org/wiki/DOS/360_and_successors
And that is really crappy compared to Microsoft's version. The
Microsoft version (or equivalent) could have been used for
debugging system problems, or experimenting on a DR site.

BFN. Paul.