This is useful in the creation of an undo stack; an example is the undo stack in VS Code. It is also useful when different things ⩩ in a system are designed to interact with other things ↭ that may change over time, and we want to maintain a consistent symbol to access that meaning: ⩩ can still interact with the version of ↭ it has been prescribed to act with.
We could store the whole data over and over again for each version. A potential problem is the size of each store if we are saving everything. An alternative would be to save the initial whole data and then save changes upon it from there. An issue with delta changes could be the time it takes to calculate the desired version, which can be mitigated by updating the baseline to a specific version.
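The baseline-plus-deltas idea can be sketched as follows, a minimal illustration with flat dicts and attribute-level diffs; all names here are hypothetical, not from the codebase:

```python
from typing import Any

REMOVED = "<removed>"  # sentinel for deleted fields (assumes it never appears as real data)

def diff(old: dict, new: dict) -> dict:
    """Record added or changed fields, plus removals via a sentinel."""
    delta: dict[str, Any] = {}
    for key in old.keys() | new.keys():
        if key not in new:
            delta[key] = REMOVED
        elif key not in old or old[key] != new[key]:
            delta[key] = new[key]
    return delta

def apply_delta(base: dict, delta: dict) -> dict:
    result = dict(base)
    for key, value in delta.items():
        if value == REMOVED:
            result.pop(key, None)
        else:
            result[key] = value
    return result

# Version 0 is stored whole; later versions store only their delta.
v0 = {"name": "a", "size": 1}
v1 = {"name": "a", "size": 2, "tag": "x"}
history = [v0, diff(v0, v1)]

# Reconstructing a version walks the baseline plus its deltas; rebasing the
# baseline to a later version would shorten this walk.
rebuilt = apply_delta(history[0], history[1])
assert rebuilt == v1
```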
Delta changes
What can this be? It could be the addition of a data field, the removal of a data field, or the change of a data field at the top level. Could this be done for nested data? E.g., here we are storing the Module at the top level:
Module(
    body=[
        Assign(
            ...
        ),
        Assign(
            ...
        ),
        Class(
            body=[
                Assign(
                    ...
                )
            ]
        )
    ]
)
We then add another assignment to the body of the class definition:
Module(
    body=[
        Assign(
            ...
        ),
        Assign(
            ...
        ),
        Class(
            body=[
                Assign(
                    ...
                ),
                Assign(
                    ...
                )
            ]
        )
    ]
)
Instead of recording it as a change upon the body attribute of the module data, it could be recorded as a change on the body attribute of the module.body[2] data.
The ability to do this would add a lot of weight to storing delta changes.
Such a comparison of the new and the old would have to occur anyway when constructing code that updates instantiated data from one version to another in Bulk data editing actions.
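The nested case above could be discerned by recursing into matching structure and recording the change against a path instead of the top-level attribute. A sketch, with nested dicts and lists standing in for the node objects (the helper name is hypothetical):

```python
def nested_diff(old, new, path=()):
    """Yield (path, new_value) pairs for the deepest differing fields."""
    if isinstance(old, dict) and isinstance(new, dict) and old.keys() == new.keys():
        for key in old:
            yield from nested_diff(old[key], new[key], path + (key,))
    elif isinstance(old, list) and isinstance(new, list) and len(old) == len(new):
        for i, (o, n) in enumerate(zip(old, new)):
            yield from nested_diff(o, n, path + (i,))
    elif old != new:
        # Record the change at this path rather than replacing the whole root.
        yield path, new

old = {"body": [{"kind": "Assign"},
                {"kind": "Class", "body": [{"kind": "Assign"}]}]}
new = {"body": [{"kind": "Assign"},
                {"kind": "Class", "body": [{"kind": "Assign"}, {"kind": "Assign"}]}]}

# The change lands at ("body", 1, "body"), not at the top-level "body".
changes = list(nested_diff(old, new))
```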
We could also store additional information on how to implement delta changes. A reason for this is the desire to store large pieces of text while avoiding the large file sizes from repeated whole stores, which would be the case if only the above were implemented. How pieces of data may internally be updated is a matter of their type. In fact, we could use the attribute-equality-based delta changes mentioned above as a fallback for when a custom method is not provided, as mentioned for strings.
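As a sketch of a type-specific delta method: strings could store text-diff opcodes (via the standard library's difflib) while everything else falls back to whole-value replacement. The dispatch function here is hypothetical, not an existing API:

```python
import difflib

def string_delta(old: str, new: str) -> list:
    """Store only the operations needed to rebuild `new` from `old`."""
    ops = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(a=old, b=new).get_opcodes():
        if tag == "equal":
            ops.append(("keep", i1, i2))        # copy old[i1:i2] unchanged
        else:
            ops.append(("insert", new[j1:j2]))  # replaced/inserted new text
    return ops

def apply_string_delta(old: str, ops: list) -> str:
    parts = []
    for op in ops:
        if op[0] == "keep":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)

def make_delta(old, new):
    # Custom per-type methods first; whole-value replacement as the fallback.
    if isinstance(old, str) and isinstance(new, str):
        return ("string", string_delta(old, new))
    return ("full", new)

old_text = "The quick brown fox"
kind, delta = make_delta(old_text, "The quick red fox")
assert kind == "string"
assert apply_string_delta(old_text, delta) == "The quick red fox"
```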
Commit data
- ID
- Timestamp - Format seconds?
- User name?
- Computer name?
- System information?
I don't think it seems right to enforce stating the user and computer, but it seems useful and a sensible default. System information may be more relevant for bug reporting.
The ID value could be an incrementing number for each version, but in some use cases it is good to be able to discern version iteration 5 from one store from version iteration 5 of another without having to state a consistent store. The only use for this I know of so far is when referencing which specification version you are following.
versionInformation: Optional[bool | tuple[str]]
So, to update the record, the design of the commit data is sitting at:
- Commit id: str
- User: Optional[str]
- Computer name: Optional[str]
- Timestamp: float (stored in seconds; the first version is the time since 1970-01-01 and subsequent versions are since the previous commit)
- Commit type: Literal[INIT, DELTA, FULL]
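That record might be sketched as a dataclass like this. Field names mirror the list above; treating later timestamps as offsets from the previous commit is my reading of the note, not settled design:

```python
import time
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class CommitType(Enum):
    INIT = auto()
    DELTA = auto()
    FULL = auto()

@dataclass
class CommitData:
    commitId: str
    commitType: CommitType
    timestamp: float               # seconds; INIT stores time since the epoch,
                                   # later commits an offset from the previous one
    user: Optional[str] = None     # useful, sensible defaults, but not enforced
    computerName: Optional[str] = None

first = CommitData("0", CommitType.INIT, time.time())
second = CommitData("1", CommitType.DELTA, 12.5, user="alice")
```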
So versioning can be implemented differently by the store, as different store structures could make good use of the freedom.
So the places in the code base in which the versioning system concretely exists are within the store interface protocol, and maybe in other places too, such as the LongTermStorageData. I don't think anywhere else.
So do I want to be able to load multiple versions into memory at once? Well, different things could reference different versions that are running at once, so yes, I think so.
So we create our LTSD and we save; this creates it in our store. We then make some changes to the data and save again. What happens now? Well, I want versioning to be off by default, so it just overwrites the existing save.
Now I have something that I want to be versioned. I create it and then I save; this creates the data in our store in the form of the first version. I now edit the data and save again, creating a second version. I now load the first version into memory again, edit that first memory representation, and save.
I think saving over versions within themselves defeats the point of versioning; it would break the interaction-point problem we intended to fix. So when we save again we create a new version.
Delta saving mechanism
This should be designed in tandem with Bulk data editing actions# ^9e834f
Delta change discernment and application design
So I think this works well. However, do we allow switching between versioned and not versioned?
It works fine if we specify it solely at or before the LTSD's first save. Perfectly possible, but maybe too much responsibility, so I think we should specify whether it is versioned at LTSD instantiation, with a default of false.
Application of a version history to find a version
So it isn't as simple as I thought it was. I thought that, to find any version, I could find the closest full version and then apply deltas from that to the desired version. This is inadequate as:
- Delta changes on a specific version describe the conversion between (the version before themselves) and (themselves). Following this convention, in order to convert backwards from a full, delta information would have to be stored alongside it.
- Full data is currently only dumped when a delta change isn't possible.
There is reason, however, to store full data alongside a delta: to reduce loading time if data has a large number of complex delta changes. The question is how one goes about creating the full-with-delta revisions. I think it should be at least a little automatic. An idea is to time the retrieval of data and, if it is over some threshold (maybe a threshold respective of its binary size), create a full entry with a delta. If we wanted to save storage space we could delete some fulls, provided that we ensure the path is still fully walkable; maybe this would just be the init.
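The timing-threshold idea might look like this; entirely hypothetical policy code around a stand-in store:

```python
import time

FULL_THRESHOLD_SECONDS = 0.05  # hypothetical; could scale with the data's binary size

class FakeStore:
    """Stand-in store: reconstruct() walks deltas, snapshots cut that walk short."""
    def __init__(self):
        self.snapshots = {}

    def reconstruct(self, version_id):
        time.sleep(0.1)  # simulate a slow walk through many deltas
        return {"version": version_id}

    def save_full_alongside_delta(self, version_id, data):
        self.snapshots[version_id] = data

def load_version(store, version_id):
    start = time.monotonic()
    data = store.reconstruct(version_id)
    if time.monotonic() - start > FULL_THRESHOLD_SECONDS:
        # Future loads of this version become a single read.
        store.save_full_alongside_delta(version_id, data)
    return data

store = FakeStore()
load_version(store, 3)
assert 3 in store.snapshots  # slow reconstruction triggered a snapshot
```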
We know we can always convert forwards, and we can only convert backwards if full data has been saved alongside a delta.
So the modification I must make to this function is that we can only select an afterIndex if it has a type of CommitType.FULL_AND_DELTA.
So, a choice: do we take preprocessed, only-necessary version data, or do we do it ourselves? Doing it ourselves may require an outside source to deserialise more than is necessary, and taking only what is necessary may shift a lot of work onto the store.
We need to take a list of versionData. What is necessary? So we have our chain: it starts with an initial commit of the whole data, followed by a sequence of delta changes and full commits depending on what was possible. Upon receiving the whole list, a programmatic aspect should discern which entry we want. If the entry we want is a full commit then we can just return it. If the entry we want is a delta change then we can look at the closest full commit to it, and also possibly at the total number of delta changes needed to reach it from either side.
""" [ 23, 6543, 2332, 6776, 3, 790] [ 0, 1, 2, 3, 4, 5 ] 6
2 is desired 1 is nearest full
the deltas of 2 describe the movement from 1 to 2 sum up transforms from [ 2: 3] == [ nearest+ 1: desired+ 1] and apply them forwards on 1
2 is desired 5 is nearest full the deltas applied backwards of 5, 4, 3 describe the movement from 5 to 2 sum up transforms from [ 3: 6] == [ desired+ 1: nearest+ 1] not quite: 5 must be full with delta 3, 4, 5 describe enough to translate back to 2, 5s transforms are obtained in a different manner it is [ 3: 5]+ transform of 5 so [ desired+ 1: nearest]+ transform of nearest """
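The reasoning in that docstring can be sketched with integers as data and additive numbers standing in for the delta transforms. The values come from the example above; which indices hold fulls is my own choice for illustration:

```python
def reconstruct(history, desired):
    """history[i] = (full, delta): full is the stored value or None; delta is
    the additive transform from version i-1 to i."""
    full, _ = history[desired]
    if full is not None:
        return full
    # Nearest full before the desired version: apply deltas forwards from it.
    before = max(i for i in range(desired) if history[i][0] is not None)
    # Nearest full after: only usable because a full-with-delta stores both.
    after = next((i for i in range(desired + 1, len(history))
                  if history[i][0] is not None), None)
    if after is not None and after - desired < desired - before:
        # Walk backwards, undoing deltas after, after-1, ..., desired+1.
        value = history[after][0]
        for i in range(after, desired, -1):
            value -= history[i][1]
        return value
    value = history[before][0]
    for i in range(before + 1, desired + 1):
        value += history[i][1]
    return value

# [23, 6543, 2332, 6776, 3, 790] with fulls stored at indices 0, 1, and 5.
history = [(23, None), (6543, 6520), (None, -4211), (None, 4444),
           (None, -6773), (790, 787)]
assert reconstruct(history, 2) == 2332  # forwards from the full at index 1
assert reconstruct(history, 4) == 3     # backwards from the full-with-delta at 5
```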
What of a quick-memory-presence version?
I have the case where the design is currently structured so that we construct a version upon construction: it gets an initial version number, which is used to save the initial version and is kept with the quick memory representation. Upon a save we construct a new id, save with that id, and keep that with the quick memory representation. When pulling from the store we can specify a versionId; if that is in memory we give you that, and if it isn't then we load it in and give it to you.
A problem with this: suppose we pull the latest version, make some changes, then require the latest version again. If we run the getLTSD function we will receive the in-memory version. If we run the _getLTSDComponents function we can avoid that problem; however, we may encounter a scenario where we need to load in LTSD of the latest saved version whilst there is already LTSD created from the latest version in memory.
So my proposal is to create a new version id whenever we pull LTSD from the store into memory, which is stored on the LTSD and is also used as the index in the store's memory. It can still store the id it was loaded from in case we want to use the reload function. Upon saving, it is still diffed against the latest version present in the store, even if we pulled from a version prior to the latest.
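A sketch of that proposal, with all names hypothetical: every pull mints a fresh in-memory version id, the originating id is kept for reload, and saving writes against the store's latest regardless:

```python
import uuid

class LTSD:
    def __init__(self, data, loaded_from=None):
        self.data = data
        self.version_id = str(uuid.uuid4())  # fresh id, used as the memory index
        self.loaded_from = loaded_from       # kept in case we want reload()

class Store:
    def __init__(self):
        self.versions = {}   # saved versionId -> data
        self.latest = None
        self.quick_mem = {}  # in-memory versionId -> LTSD

    def save(self, ltsd):
        new_id = str(uuid.uuid4())
        # Diffed (here: simply stored whole) against the latest saved version,
        # even if the LTSD was pulled from an older one.
        self.versions[new_id] = dict(ltsd.data)
        self.latest = new_id
        return new_id

    def pull(self, version_id):
        # A fresh id per pull: two pulls of the same saved version are
        # distinct in-memory objects.
        ltsd = LTSD(dict(self.versions[version_id]), loaded_from=version_id)
        self.quick_mem[ltsd.version_id] = ltsd
        return ltsd

store = Store()
first = LTSD({"x": 1})
saved_id = store.save(first)
pulled = store.pull(saved_id)
assert pulled.loaded_from == saved_id
assert pulled.version_id != saved_id
```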
So, to run over the desired LTSD interactions:
initialCreation -> create versionId, with loaded set to a separate new id. In dataInit: saved to quickMemMap using versionId; the initial version is saved using the loaded id.
This is another big rework, and why?
So it may be clapped, but the first way is how we're rolling for now.