Attributes moved to Flex

A project log for aStrA : Aligned Strings format with attributes

Developing a more flexible pointer and structure format that solves POSIX/C's historical problems.

Yann Guidon / YGDES, 09/07/2024 at 01:42

While reviewing, updating and rewriting the core files, something became apparent: the UTF-8 attributes can be restricted to the Flex Types 1 and 2, because their "allocated" field is also limited to (approx.) 8 and 16 bits. This leaves 15 MSB for other uses, and the attributes are a perfect fit. They do not need to live in the trailing/terminator byte, whose address is quite cumbersome to compute.

The trick is that "fixed" strings are mostly constants and their contents are already known (more or less). We trust the compilers enough to create the appropriate UTF-8 sequence, right? The purpose of the attributes is to help filter external data, which is not yet known and is likely to contain errors. Usually its size is not known in advance, so it is received in a Flex buffer, which is why the attributes are not required in a fixed format.

During reception, the incoming data is processed with 2 methods:

In the first case, the attributes are reconstructed by a simple scan; in the second case, the scan can be merged with the re-aligning string copy.
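As a rough illustration of these two cases, here is a minimal C sketch. The attribute bits (`ATTR_NON_ASCII`, `ATTR_CONTROL`) and the function names are hypothetical, made up for this example and not taken from the aStrA sources. It also works byte by byte and does not validate UTF-8 sequences, whereas a real re-aligning copy would likely work on whole words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute bits, for illustration only:
   the actual aStrA attribute set is not described in this log. */
#define ATTR_NON_ASCII 0x01u  /* at least one byte >= 0x80 */
#define ATTR_CONTROL   0x02u  /* at least one byte < 0x20, or 0x7F */

/* Classify a single byte into the attribute bits it contributes. */
static unsigned classify_byte(uint8_t c)
{
    if (c >= 0x80)
        return ATTR_NON_ASCII;
    if (c < 0x20 || c == 0x7F)
        return ATTR_CONTROL;
    return 0;
}

/* First case: the data is already in place, so only scan it. */
static unsigned scan_attributes(const uint8_t *src, size_t len)
{
    unsigned attr = 0;
    for (size_t i = 0; i < len; i++)
        attr |= classify_byte(src[i]);
    return attr;
}

/* Second case: the data must be moved anyway, so the scan is
   merged into the copy and each byte is read only once. */
static unsigned copy_with_attributes(uint8_t *dst, const uint8_t *src,
                                     size_t len)
{
    unsigned attr = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t c = src[i];
        dst[i] = c;
        attr |= classify_byte(c);
    }
    return attr;
}
```

Both functions return the same attribute word for the same input. The merged version simply avoids a second pass over the buffer.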

...

This changes quite a few things, as I have already started removing the optionality of the NULL terminator. Again, it becomes useless. It is also good to speed up the computation of the attribute's address, because the code was getting awkward, almost defeating the point of the Aligned Strings in the first place.
