[Libpst-devel] size_t fields

Peter Whittaker pwwnow at gmail.com
Sun Feb 11 15:46:59 CET 2007


On Mon, 2007-01-29 at 20:00 +0000, Chris Halls wrote:
> We have a problem with the attachment->size field. It's a 32 bit integer in 
> the PST file but is a size_t in libpst.h. That means on a 64 bit machine you 
> have different sizes when doing a memcpy of size sizeof(size_t).
> 
> To fix this we can either change the size field to be of type int32_t or 
> convert to 64 bit when setting the field. I reckon it is better to keep the 
> size_t field and do the conversion when reading the PST file:
> 
> That way we keep the original ABI, the size_t field is the preferred type for 
> the processor and we have room for expansion later should it become possible 
> to attach files > 2GB in the future.
> 
> There are a few more size_t fields in libpst.h so I guess we might need to do 
> a similar thing there too.

If attachment->size in the PST is a 32 bit integer, then neither
proposal is the right way to go, IMHO. When given a choice between two
alternatives, I always choose Buzz Lightyear - there's always a third
answer... :->.

If the attachment size in the file is 32 bits, we should use an unsigned
32 bit integer (uint32_t). If we define it as int32_t, then we're almost
guaranteed to encounter problems forcing a 32 bit non-negative value
into a signed data type: any size of 2GB or more would no longer fit.
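
To make the signed/unsigned point concrete, here is a minimal sketch of
reading such a field. The helper names read_le32 and fits_in_int32 are
hypothetical (they are not libpst functions), and the sketch assumes the
field is stored little-endian in the file.

```c
#include <stdint.h>

/* Hypothetical helper (not part of libpst): decode a 32 bit
 * little-endian field from a raw buffer one byte at a time, so the
 * result does not depend on the host's word size or byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the value could be stored in an int32_t unchanged,
 * 0 if it is too large for a signed 32 bit type. */
static int fits_in_int32(uint32_t v)
{
    return v <= (uint32_t)INT32_MAX;
}
```

Exactly four bytes are consumed regardless of sizeof(size_t), and any
value with the top bit set is flagged as not fitting in an int32_t.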

Using size_t is a misuse of the data type: size_t is defined in relation
to sizeof(). IMHO it should be used only for data types and functions
that have to do directly with its purpose, e.g., architecture-specific
references to data type sizes, etc. Even if MS defined it that way, it's
probably the wrong thing to do - refer to versioning comments and MS
magic below.

Keeping the original ABI only makes sense if the ABI makes sense: If
using size_t in the ABI was a design or implementation error, that error
should be corrected.

If we want a layer of abstraction to support larger attachments at some
future time - and this *is* a good idea, you never know what MS is going
to throw at you, always best to be wearing your teflon-coated,
kevlar-based exposure suit - then we should wrap the underlying data
type in whatever conditionals will apply, e.g.,

#if PSTVERS >= x.y || ARCH_IS_BIG // or whatever, you get the idea
#define ATTACH_SIZE uint64_t
#else
#define ATTACH_SIZE uint32_t
#endif

In practice, the conditionals in the #if could be much more complicated,
since we have to deal with the case where the PST file comes from a 32
bit OS/machine and is being manipulated on a 64 bit machine, e.g., where
I use a shiny new screaming AMD as my Linux box and bring all my data
files over from my old slow 32 bit Winny. (Actually, that wouldn't be so
bad: the attachments would fit, and although the code would be
inefficient, the inefficiency likely would not be noticed.)

The best approach is likely to detect the PST version at run-time and
make appropriate choices in underlying code.

Having an attachment > 2GB will require a change to the PST structure.
This will require either specific PST versioning or MS-magic (some
bizarre code to detect these PSTs). These PSTs will require a new
libpst.

So the pragmatic approach (teflon and kevlar) is to detect the PST
version at run time and return an error if it does not match one we
know about.
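
A rough sketch of what that run-time check might look like. The version
codes, the error value, and the function name attach_size_width are all
placeholders invented for illustration; none of them are taken from the
PST format or from libpst.

```c
#include <stdint.h>

/* Placeholder version codes: assumptions for illustration only, not
 * values from the PST format or from libpst. */
#define PST_VER_OLD     1u
#define PST_VER_CURRENT 2u

#define PST_ERR_UNKNOWN_VERSION (-1)

/* Returns the width in bytes of the on-disk attachment size field, or
 * PST_ERR_UNKNOWN_VERSION if the version is not one we recognize. */
static int attach_size_width(unsigned version)
{
    switch (version) {
    case PST_VER_OLD:
    case PST_VER_CURRENT:
        return (int)sizeof(uint32_t);   /* known PSTs: 32 bit sizes */
    default:
        return PST_ERR_UNKNOWN_VERSION; /* refuse rather than guess */
    }
}
```

The caller decides what to do with the error. The point is only that an
unrecognized PST is rejected up front instead of being parsed with the
wrong field width.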

If we recognize the PST, we work on it with 32 bit attachments.

When biggiePSTs come along, we work on the underlying code, abstracting
all of this stuff out, and bullet-proofing against pathological cases
(e.g., biggiePSTs with attachments > 2GB being processed on boxes that
support only files < 2GB).
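
For that pathological case, the bullet-proofing could be as simple as a
range check before narrowing. This is only a sketch: the function name
attach_size_to_host and the host_limit parameter are hypothetical, and
the caller is assumed to know the largest file its platform can handle.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert an on-disk attachment size to a host
 * size_t. Returns 0 and stores the size in *out on success; returns -1
 * and leaves *out untouched if the size exceeds either the caller's
 * limit or what size_t can represent on this machine. */
static int attach_size_to_host(uint64_t attach_size, uint64_t host_limit,
                               size_t *out)
{
    if (attach_size > host_limit || attach_size > SIZE_MAX)
        return -1;
    *out = (size_t)attach_size;
    return 0;
}
```

With a 2GB limit, a 3GB attachment is rejected cleanly instead of being
silently truncated.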

Finally, using size_t and playing conversion games is, IMHO, simply
going to complicate the code for no good reason. It also strikes me as
the kind of thing MS would do: Use the wrong data type then throw
complicated code at the problem to handle future exceptions to past
design.

Of course, this is all FWIW, and I probably haven't had enough coffee,
so YMMV. $0.02 doesn't go as far as it used to....

pww
