MBR and sector size

muta...@gmail.com

2021-10-30 00:16:01 UTC

I noticed from here:

https://en.wikipedia.org/wiki/Master_boot_record

that the first 512 bytes of the hard disk doesn't store
the sector size. So when you need to get to the first
partition, you have the LBA number, but you don't know
how many bytes you need to seek to.
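
A minimal sketch of the arithmetic behind this problem. The function name `lba_to_byte_offset` and the example LBA of 2048 are illustrative assumptions, not values taken from the MBR layout:

```python
def lba_to_byte_offset(lba, sector_size):
    """Return the byte offset of a logical block inside a flat disk image."""
    return lba * sector_size

# The same LBA lands at very different byte offsets depending on the sector size,
# which is why the sector size has to come from somewhere outside the MBR.
for size in (512, 4096):
    print(size, lba_to_byte_offset(2048, size))
```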

I guess this means I need to configure my hard disks
in my own BIOS and OS with the sector size.

BFN. Paul.

Grant Taylor

2021-10-30 01:10:23 UTC

that the first 512 bytes of the hard disk doesn't store the sector
size.

I thought sector size was a hard drive physical property and not
something that could easily be adjusted. Thus there is effectively no
need to look it up.

As I understand it, there are two vastly predominant sector sizes; the
venerable 512 bytes and the new 4096 bytes (8 x traditional 512 bytes).
All drives that have sectors larger than 512 are multiples of 512 and
have virtual 512 sectors for backwards compatibility.

--
Grant. . . .
unix || die

wolfgang kern

2021-10-30 16:57:59 UTC

how to add a dummy sector to an existing HD ?

Oh - some of the things I do is actually speculation on what
would be ideal rather than something I'm planning on doing
at 3pm tomorrow.
So to add a dummy sector to an existing HD, all you need
is firmware loaded that returns with 512 bytes of data that
the HD has no clue about, when LBA 0 or maybe LBA -1 is
requested
If LBA 0 is chosen, the data will need to be unloaded, all
the LBA references have 1 added to them, and then the
HD is reloaded.
If anyone complains, I'll ask Jens to bomb them.

LBA 0 is a valid sector, known and used as MBR
LBA -1 may exist as 2^48-1 in far distant future,
today it will just lock up the drive with ERR.
512*(2^48-1)==144'115'188'075'855'360 bytes
__
wolfgang

Scott Lurndal

2021-10-30 16:58:24 UTC

Post by Grant Taylor

that the first 512 bytes of the hard disk doesn't store the sector
size.

I thought sector size was a hard drive physical property and not
something that could easily be adjusted. Thus there is effectively no
need to look it up.

When the hard disk data is extracted into a flat file of
sectors, how do I know how many bytes to skip to go
to a particular LBA address?

For a SCSI drive, the READ CAPACITY command will return the
maximum logical block address and the logical block length in
bytes.
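
As a rough illustration, the 8 bytes returned by READ CAPACITY (10) can be decoded as two big-endian 32-bit fields. This sketch only decodes a response; actually issuing the command (for example through a pass-through interface) is not shown, and the sample bytes are made up:

```python
import struct

def parse_read_capacity_10(data):
    """Decode the 8-byte READ CAPACITY (10) parameter data.

    Both fields are big-endian 32-bit integers: the address of the last
    logical block, then the logical block length in bytes.
    """
    if len(data) < 8:
        raise ValueError("READ CAPACITY (10) returns 8 bytes")
    last_lba, block_length = struct.unpack(">II", data[:8])
    return last_lba, block_length

# Hypothetical response: last LBA 0x003FFFFF, 512-byte logical blocks.
sample = bytes.fromhex("003fffff00000200")
last_lba, block_length = parse_read_capacity_10(sample)
total_bytes = (last_lba + 1) * block_length
```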

For an ATA/SATA device use the ATA8 Identify Device command
to get the logical sector size (words 117-118, and word 106 bits <15>, <14>, and <12>).
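
A sketch of how those words might be interpreted, assuming the 512 bytes of IDENTIFY DEVICE data have already been read from the drive. The function name is hypothetical; the bit meanings follow the ATA-8 layout referred to above (word 106 is valid when bit 15 is 0 and bit 14 is 1, and bit 12 says the logical sector is longer than 256 words):

```python
import struct

def ata_logical_sector_size(identify):
    """Return the logical sector size in bytes from IDENTIFY DEVICE data."""
    if len(identify) != 512:
        raise ValueError("IDENTIFY DEVICE data is 256 16-bit words")
    words = struct.unpack("<256H", identify)
    w106 = words[106]
    # Word 106 only carries information when bit 15 is 0 and bit 14 is 1.
    valid = (w106 & 0xC000) == 0x4000
    # Bit 12 set: the logical sector is longer than 256 words (512 bytes),
    # and words 117-118 hold its size in 16-bit words.
    if valid and (w106 & 0x1000):
        size_in_words = words[117] | (words[118] << 16)
        return size_in_words * 2
    return 512
```

When bit 12 is clear, or word 106 is not valid, the sketch falls back to the traditional 512 bytes.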

Grant Taylor

2021-10-30 17:00:28 UTC

When the hard disk data is extracted into a flat file of sectors,
how do I know how many bytes to skip to go to a particular LBA address?

I don't know the technical answer to that.

But I do know that I've never had to worry about that in 20+ years of
doing a lot of drive / file system imaging.

I either image a file system (contents of a partition) or I image the
entire drive (including partition information). I access the former as
a raw file system. I access the latter via Linux kernel's ability to
see partitions on a loopback and then access the contents of a partition
therein.

The only time that I've needed to access something independently is when
I'm needing to wipe the first sector for some reason. And that's
inherently the first sector. So I don't need to find out where an LBA
address is in an image.

I have an expectation that data can be copied to and from devices
and still be accessible.

My experience is that as long as the source and destinations are the
same type (partition contents / file system -or- whole drive) I don't
have any problems doing this.

I need to either distinguish devices based on sector size or I need
to introduce a dummy sector. If I have a dummy sector I would have
an expectation that the firmware would maintain that sector as if it
was real.

You should be able to extract the information from any standard file
system / partition data in the image. I'd expect a quick analysis /
pattern match on the first 4kB of data would clearly give you
information to know what things are. Either it will have a standard
partition table (indicating it's a whole drive image) or it will have
standard file system information (indicating it's a partition image) or
it will not have either of those (indicating that it's likely an
atypical use) which can probably be accessed as a raw device.

I seriously doubt that you need to create a dummy sector or burn a
sector as a dummy sector. I would bet dollars to doughnuts that you can
look for a different pattern and find it exists in almost all existing
drives / images thereof.

Or maybe I am looking at things incorrectly.

;-)

--
Grant. . . .
unix || die

muta...@gmail.com

2021-10-30 17:30:38 UTC

Post by Grant Taylor
I either image a file system (contents of a partition) or I image the
entire drive (including partition information). I access the former as
a raw file system. I access the latter via Linux kernel's ability to
see partitions on a loopback and then access the contents of a partition
therein.

Do you agree that in the latter case, Linux will need to
hardcode the number 512 for that to work?

Post by Grant Taylor
You should be able to extract the information from any standard file
system / partition data in the image. I'd expect a quick analysis /
pattern match on the first 4kB of data would clearly give you
information to know what things are. Either it will have a standard
partition table (indicating it's a whole drive image)

I know it is a whole drive image. I just don't know whether
it is using 512 or 4096 or some other number sectors. I was
expecting this information to be recorded some place in the
MBR, the same as it is in the FAT boot sector.

Once you get to the FAT, everything is cool - the info is there,
but how are you supposed to get there?! Maybe I should change
the MBR format the same as other people did.
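
One possible workaround, sketched under stated assumptions: since the FAT boot sector records its own bytes-per-sector value, a tool can try each candidate sector size, seek to the first partition's LBA times that size, and accept the candidate whose boot sector agrees with it. This is a heuristic that only helps when the first partition holds FAT; the candidate list and all function names are hypothetical.

```python
import io
import struct

CANDIDATE_SECTOR_SIZES = (512, 1024, 2048, 4096)
BOOT_SIGNATURE = b"\x55\xaa"

def first_partition_start_lba(mbr):
    """Return the starting LBA of the first in-use MBR partition entry."""
    if mbr[510:512] != BOOT_SIGNATURE:
        raise ValueError("no MBR boot signature")
    for index in range(4):
        entry = mbr[446 + 16 * index : 446 + 16 * (index + 1)]
        partition_type = entry[4]
        start_lba = struct.unpack_from("<I", entry, 8)[0]
        if partition_type != 0 and start_lba != 0:
            return start_lba
    raise ValueError("no partition entries in use")

def guess_sector_size(image):
    """Guess the sector size of a whole-disk image whose first partition is FAT."""
    image.seek(0)
    start_lba = first_partition_start_lba(image.read(512))
    for size in CANDIDATE_SECTOR_SIZES:
        image.seek(start_lba * size)
        boot = image.read(512)
        if len(boot) < 512 or boot[510:512] != BOOT_SIGNATURE:
            continue
        # The FAT BPB stores bytes per sector at offset 11 (little-endian).
        if struct.unpack_from("<H", boot, 11)[0] == size:
            return size
    return None

def build_demo_image(sector_size, start_lba=2):
    """Build a tiny synthetic disk image (illustration only)."""
    mbr = bytearray(512)
    mbr[446 + 4] = 0x0C                            # partition type: FAT32 (LBA)
    struct.pack_into("<I", mbr, 446 + 8, start_lba)
    mbr[510:512] = BOOT_SIGNATURE
    boot = bytearray(512)
    struct.pack_into("<H", boot, 11, sector_size)  # BPB bytes per sector
    boot[510:512] = BOOT_SIGNATURE
    image = bytearray(start_lba * sector_size) + boot
    image[:512] = mbr
    return io.BytesIO(bytes(image))

print(guess_sector_size(build_demo_image(4096)))
```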

BFN. Paul.

Scott Lurndal

2021-10-30 20:18:30 UTC

Post by ***@gmail.com

Do you agree that in the latter case, Linux will need to
hardcode the number 512 for that to work?

No, it will use the BLKBSZGET/BLKSSZGET ioctls to retrieve
the sector size for the device.
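
A minimal sketch of querying the logical sector size from user space on Linux, using the BLKSSZGET request from <linux/fs.h>. The helper name and the device path in the comment are assumptions, and BLKBSZGET is left out because its request number depends on the size of size_t:

```python
import fcntl
import struct

# Request number from <linux/fs.h>: _IO(0x12, 104).
BLKSSZGET = 0x1268

def logical_sector_size(path):
    """Ask the Linux block layer for a device's logical sector size in bytes."""
    with open(path, "rb") as dev:
        buf = bytearray(4)  # the kernel writes a C int into this buffer
        fcntl.ioctl(dev.fileno(), BLKSSZGET, buf)
        return struct.unpack("i", buf)[0]

# Example (needs read permission on the device node; the path is hypothetical):
# print(logical_sector_size("/dev/sda"))
```

Note that this only works on a block device node. A flat image file has no such ioctl, which is the case the question above is about.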

Scott Lurndal

2021-10-30 20:22:21 UTC

Post by ***@gmail.com
I know it is a whole drive image. I just don't know whether
it is using 512 or 4096 or some other number sectors. I was
expecting this information to be recorded some place in the
MBR, the same as it is in the FAT boot sector.

That's a bit of a chicken and egg problem, in that you really
should know the sector size before reading the device.

Both SCSI and ATA provide a means for the device driver to
determine the sector size of the media (512, 2048, 4096,
527, 180, 100, 384 - I've seen them all in real world disks).

Most modern operating systems intentionally hide the details
of the underlying storage system from the application, for
good reason.

muta...@gmail.com

2021-10-30 23:18:53 UTC

I'm now thinking that my pseudo-BIOS which exports fread etc
to the OS, should be using the MVS flavor of the C library, which
is focused on an expectation that the underlying system will
give you a block, and you will be told the size of that block, and
the underlying system does not provide a facility to keep track of
a byte offset within that block for you.
As opposed to a C library that can make use of a seek() syscall
that will position on byte offset 3 or whatever in a file for you.
I think that is the concept I have been missing.
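
To illustrate the distinction, here is a sketch of byte-offset reads layered on top of an interface that can only return whole blocks. `BlockDevice` is a hypothetical in-memory stand-in for something like a BIOS sector read, not part of PDPCLIB or any real library:

```python
class BlockDevice:
    """Hypothetical stand-in for a sector-only interface such as a BIOS disk read."""

    def __init__(self, data, block_size):
        if len(data) % block_size:
            raise ValueError("data must be a whole number of blocks")
        self.data = data
        self.block_size = block_size

    def read_block(self, lba):
        start = lba * self.block_size
        return self.data[start : start + self.block_size]

def read_bytes(dev, offset, length):
    """Emulate a byte-offset read using whole-block reads only."""
    out = bytearray()
    while length > 0:
        # Split the byte offset into a block number and a position inside it.
        lba, skip = divmod(offset, dev.block_size)
        block = dev.read_block(lba)
        if not block:
            break  # past the end of the device
        chunk = block[skip : skip + length]
        out += chunk
        offset += len(chunk)
        length -= len(chunk)
    return bytes(out)
```

The library, not the device, keeps track of the position inside the block, which is the bookkeeping a seek() syscall would otherwise do.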

Is there existing terminology for this? Block devices versus
character streams perhaps?

Thanks. Paul.

Scott Lurndal

2021-10-31 00:04:35 UTC

Post by ***@gmail.com

Is there existing terminology for this? Block devices versus
character streams perhaps?

Congratulations, you've re-invented Unix 'block' and 'character'
devices.

muta...@gmail.com

2021-10-31 01:07:30 UTC

Post by Scott Lurndal

Post by ***@gmail.com

Is there existing terminology for this? Block devices versus
character streams perhaps?

Congratulations, you've re-invented Unix 'block' and 'character'
devices.

An ordinary file is considered to be a "device", is it?

Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

And does MVS deal exclusively with block devices?

When I went to implement PDPCLIB for MVS, I already had
a DOS/OS2 version with only minor differences, but for MVS
I knew I needed to deal with "records" and I wasn't sure whether
it could fit in with the code I already had, so I simply put some
#ifdefs and created brand new versions of fread/fgets/etc to
deal with MVS records.

It never occurred to me that the MVS flavor would be relevant
to Unix block devices.

BFN. Paul.

Joe Monk

2021-10-31 15:05:02 UTC

"MVS, and the old mainframe paradigm of logical record
sizes, record formats, blocking factors are very thankfully
obsolete."

You should probably take a good look at whatever is in that pipe that you're smoking ... You're very clearly hallucinating.

https://www.reddit.com/r/mainframe/comments/mgjzs8/mainframes_in_credit_card_processing/

Joe

Rod Pemberton

2021-11-01 07:33:50 UTC

On Sun, 31 Oct 2021 08:05:02 -0700 (PDT)

Post by Joe Monk
"MVS, and the old mainframe paradigm of logical record
sizes, record formats, blocking factors are very thankfully
obsolete."
You should probably take a good look at whatever is in that pipe that
you're smoking ... You're very clearly hallucinating.
https://www.reddit.com/r/mainframe/comments/mgjzs8/mainframes_in_credit_card_processing/

This is a Usenet newsgroup that you're posting to from Google Groups.
(The latter part is an assumption since you use Gmail.)

I.e., those on Usenet use newsreaders to read the thread. As such,
replies are not ordered or threaded the same way as on browser based
Google Groups. In fact, I can select numerous different ordering and
threading combinations in this newsreader. So, when you snip so much
context, as well as the header lines which show to whom you replied,
we have to go back up through the thread to determine just whom you
replied to. In this case, you were replying to a post which is up six
messages in my newsreader, because other replies were threaded below,
before yours. You were replying to Scott Lurndal, not Paul Edwards.

--
Why does Mr Zuckerberg's metaverse avatar appear as if it's being
strangled by an invisible noose? chest out, shoulders back, ...

muta...@gmail.com

2021-11-01 00:15:34 UTC

Post by ***@gmail.com
Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

The entire point of Unix was to treat all files as simple streams
of bytes. The C library didn't give a shit about the file type.
The Operating System ensured that all files (disk, tape, terminal,
card readers, line printers, et alia) appear to the application as
a simple stream of bytes.

I thought the goal was for the application to get a simple
stream of bytes, not necessarily the C library.
For the C library, I think/guess that it should be aware of
whether it is dealing with a block device or a character
device. If it is dealing with a block device, then, for instance,
it shouldn't issue a write() syscall until it has a buffer of
suitable size for that device. Similarly, it should be conscious
when seeking that it is seeking to a block boundary, and
reading should be done of an entire block.
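
A sketch of that write-side behaviour, assuming a hypothetical `write_block(lba, data)` callback that accepts exactly one block per call. Nothing is handed to the device until a full block has accumulated:

```python
class BlockWriter:
    """Accumulate bytes and hand them to the device one full block at a time."""

    def __init__(self, write_block, block_size):
        self.write_block = write_block  # hypothetical callback: write_block(lba, data)
        self.block_size = block_size
        self.pending = bytearray()
        self.next_lba = 0

    def write(self, data):
        self.pending += data
        # Emit only complete blocks; any remainder stays buffered.
        while len(self.pending) >= self.block_size:
            block = bytes(self.pending[: self.block_size])
            del self.pending[: self.block_size]
            self.write_block(self.next_lba, block)
            self.next_lba += 1

    def flush(self, pad=b"\x00"):
        """Pad the final partial block so that it can still be written whole."""
        if self.pending:
            short = self.block_size - len(self.pending)
            self.write(pad * short)
```

Padding on flush is one design choice among several; a real library might instead read the existing block, merge the partial data into it, and write it back.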

I think I know what was bugging me.

The Unix implementation of the C library doesn't need to
care about blocks - Unix will take care of that.

My situation is that I will have a C library directly doing BIOS
calls (and maybe UEFI calls), not OS calls, so I'm the one that
needs to deal with the block devices, specifically sector reads
and writes, so I need to use the "MVS" flavor of PDPCLIB,
which is incorrectly named "MVS" and should instead have
been called "block".

I think.

BFN. Paul.

Rod Pemberton

2021-11-01 07:14:13 UTC

On Sun, 31 Oct 2021 14:07:31 GMT

The C library didn't give a shit about the file type.

Could you clarify? ...

I.e., both fopen() and freopen() require the file type, either text or
binary, for the three modes: read 'r', write 'w', and append 'a'. ANSI
C defaults to opening files as text, unless the file is opened as
binary by adding a 'b' to the mode. Adding a '+' selects update mode
(reading and writing) and does not by itself make the stream binary.

--
Why does Mr Zuckerberg's metaverse avatar appear as if it's being
strangled by an invisible noose? chest out, shoulders back, ...

Rod Pemberton

2021-11-01 07:14:17 UTC

On Sun, 31 Oct 2021 14:07:31 GMT

Post by ***@gmail.com
Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

The entire point of Unix was to treat all files as simple streams
of bytes.

In general, I'd agree with you that Unix "treat[s] all files as simple
streams of bytes" as a matter of Unix design philosophy. I would
actually claim that the concept includes not just files, but also
includes all permanent storage media and memory mapped devices.

But, to say the (emphasis added) "The ENTIRE POINT of Unix" is only to
treat files as streams, is a bit much, at least for me anyway. I.e.,
you a multitude of Unix design philosophies and capabilities into just
one primacy concept.

--
Why does Mr Zuckerberg's metaverse avatar appear as if it's being
strangled by an invisible noose? chest out, shoulders back, ...

Rod Pemberton

2021-11-01 07:21:20 UTC

On Mon, 1 Nov 2021 02:14:17 -0500

Post by Rod Pemberton
On Sun, 31 Oct 2021 14:07:31 GMT

Post by ***@gmail.com
Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

The entire point of Unix was to treat all files as simple streams
of bytes.

In general, I'd agree with you that Unix "treat[s] all files as simple
streams of bytes" as a matter of Unix design philosophy.

Oh, I said "in general" because I think the Unix concept was
"Everything is a file," whereas C's concept is that every C object maps
onto a sequence of contiguous C bytes. I.e., it seems as if you've
merged or conflated the two concepts into one.

--
Why does Mr Zuckerberg's metaverse avatar appear as if it's being
strangled by an invisible noose? chest out, shoulders back, ...

Rod Pemberton

2021-11-02 10:31:18 UTC

On Mon, 1 Nov 2021 02:14:17 -0500

Post by Rod Pemberton
On Sun, 31 Oct 2021 14:07:31 GMT

Post by ***@gmail.com
Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

The entire point of Unix was to treat all files as simple streams
of bytes.

In general, I'd agree with you that Unix "treat[s] all files as simple
streams of bytes" as a matter of Unix design philosophy. I would
actually claim that the concept includes not just files, but also
includes all permanent storage media and memory mapped devices.
But, to say the (emphasis added) "The ENTIRE POINT of Unix" is only to
treat files as streams, is a bit much, at least for me anyway. I.e.,
you a multitude of Unix design philosophies and capabilities into just
one primacy concept.

you reduced a multitude

--
Why does Mr Zuckerberg's metaverse avatar appear as if it's being
strangled by an invisible noose? chest out, shoulders back, ...

Scott Lurndal

2021-11-01 13:58:49 UTC

Post by ***@gmail.com

Post by Scott Lurndal
Congratulations, you've re-invented Unix 'block' and 'character'
devices.

An ordinary file is considered to be a "device", is it?
Regardless, would you expect a C library to have different paths
for dealing with block devices vs character devices?

I thought the goal was for the application to get a simple
stream of bytes, not necessarily the C library.

The "C library" as you call it simply calls the operating
system. The C library sees only a stream of bytes.

MVS, and the old mainframe paradigm of logical record
sizes, record formats, blocking factors are very thankfully
obsolete.

I'm with Joe on that, but regardless, I wish to support this
paradigm AND the Unix paradigm.

The advantage of the Unix paradigm is that the _application_
can build whatever level of records or blocks that it needs
on top of the byte stream. That's why unix has lseek/fseek.
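
As a small illustration of records built by the application on top of a plain byte stream, the sketch below uses seek to address fixed-length records by index. The record layout (a 4-byte id plus a 12-byte name) is a made-up example:

```python
import io
import struct

RECORD = struct.Struct("<I12s")  # hypothetical layout: 4-byte id, 12-byte name

def write_record(stream, index, record_id, name):
    """Store one record at its slot; the slot position is just index * size."""
    stream.seek(index * RECORD.size)
    stream.write(RECORD.pack(record_id, name))

def read_record(stream, index):
    """Fetch one record by index and strip the NUL padding from the name."""
    stream.seek(index * RECORD.size)
    record_id, name = RECORD.unpack(stream.read(RECORD.size))
    return record_id, name.rstrip(b"\x00")

stream = io.BytesIO()
write_record(stream, 0, 1, b"alpha")
write_record(stream, 2, 3, b"gamma")  # seeking past the end leaves a zero-filled gap
```

The stream itself knows nothing about records; the record length lives entirely in the application.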

There's no need to have that stuff in the operating system
(or in the case of VAX/VMS, in ring 1 (executive)).

Joe Monk

2021-11-02 11:27:47 UTC

Post by Scott Lurndal
There's no need to have that stuff in the operating system
(or in the case of VAX/VMS, in ring 1 (executive)).

Hmmm.... without that stuff in the OS, how does the OS build paging files? Or spool files?

Joe

Joe Monk

2021-11-03 01:46:34 UTC

Spool files are an artifact of the unit record world, thankfully also obsolete.

All Windows printing is done with spool files, hence the existence of the print spooler.

Joe

Scott Lurndal

2021-11-03 15:04:30 UTC

Post by Joe Monk

Spool files are an artifact of the unit record world, thankfully also obsolete.

All Windows printing is done with spool files, hence the existence of the print spooler.

And none of the handling for that is in the NTOS kernel. It's in usermode code like
it should be.

Joe Monk

2021-11-06 23:26:17 UTC

Post by Joe Monk

Spool files are an artifact of the unit record world, thankfully also obsolete.

All Windows printing is done with spool files, hence the existence of the print spooler.

And none of the handling for that is in the NTOS kernel. It's in usermode code like
it should be.

All spool file handling in MVS is in JES2. It's not part of the nucleus either.

Access Methods (BSAM, QSAM, etc.) are not in the nucleus either. The nucleus issues CCWs just like everything else.

Joe

Grant Taylor

2021-11-06 23:52:48 UTC

Post by Joe Monk
All spool file handling in MVS is in JES2.

What about JES3?

What about other 3rd party alternatives?

--
Grant. . . .
unix || die

Rod Pemberton

2021-10-31 11:38:30 UTC

On Fri, 29 Oct 2021 17:16:01 -0700 (PDT)

Post by ***@gmail.com
https://en.wikipedia.org/wiki/Master_boot_record
that the first 512 bytes of the hard disk doesn't store
the sector size. So when you need to get to the first
partition, you have the LBA number, but you don't know
how many bytes you need to seek to.

According to an article by Seagate linked to by Wikipedia (link below),
LBAs are always 512 bytes. (Is this correct? IDK ...) Apparently,
LBAs are always 512 bytes even for the 4KB block devices. (I guess
this could be true for 4KB block devices if using LBA48.)

"Each 512-byte sector is assigned a unique LBA, from zero (0) to the
number required based on the size of the disk."

Also, according to the same article, OSes and partitioning software are
either 4K aware or they aren't. I.e., for 4K aware OS and partitioning
software, LBA 0 should be aligned to 0th byte offset or the 1st 512
block of the 0th 4KB block ("aligned"), instead of the 2nd, 3rd, 4th,
5th, 6th, 7th, or 8th 512 block of the 0th 4KB block ("unaligned"). If
the OS and partition software is 4K aware, then the partitioning software
also aligns the partition start to a 4KB boundary. Apparently,
alignment of partitions to non-4K boundaries is to be avoided. If a
partition is unaligned on a 4K device, then it must be re-aligned using
special tools, or deleted and recreated with proper 4K alignment.

There is a table for Windows OSes as to whether or not they're 4K
aware. The article also lists the kernel version from which Linux is
4K aware.

https://www.seagate.com/tech-insights/advanced-format-4k-sector-hard-drives-master-ti/

In other words, or AIUI, LBAs work as we expect, reading or writing
only 512 byte blocks for both 512 byte or 4KB block devices. And, so,
if the image is 512 byte block based, everything is fine. And, if the
image is 4KB block, the partitioning software should've aligned the
partition start to a 4KB boundary. If the image is 4KB block, and the
partition is unaligned, someone must fix the partition start on the 4KB
device to be 4K aligned, and remake the image.
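
The alignment rule described above reduces to a simple divisibility check. A minimal sketch, assuming 512-byte logical sectors on a 4096-byte physical sector drive; the function name and default values are illustrative:

```python
def is_4k_aligned(start_lba, logical_sector_size=512, physical_sector_size=4096):
    """Return True if a partition's first byte falls on a physical sector boundary."""
    return (start_lba * logical_sector_size) % physical_sector_size == 0

# LBA 63 was the traditional start of the first partition on older systems;
# LBA 2048 (a 1 MiB offset) is a common choice for newer partitioning tools.
for lba in (63, 2048):
    print(lba, is_4k_aligned(lba))
```

With 512-byte logical sectors this is the same as asking whether the starting LBA is a multiple of 8.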

--
Democrats are trying to tax billionaires but repeal the SALT cap which
benefits the rich, at the same time.