[NEW] Localisable ENUM Type

Ladies and Gentlemen!  
The first time in the world!  🙂
Localizable Enum Type in DBMS!  

We already many months have working ENUM type in 5.0 branch of Valentina. Let me remind that ENUM type is not from SQL standard, so different DBs implement it in different way if at all implement. We have implemented it using CREATE TYPE command of SQL Standard. And we have implement ENUM in way similar to PostgreSQL, because it is the most correct: you just CREATE TYPE ENUM once and later using it in all places of your database.

mySQL, in contrast, defines ENUM as part of a particular Table, right in the CREATE TABLE command. This is not good of course, because then you cannot use this type in other tables or for variables of Stored Procedures.


CREATE TABLE sizes (
name ENUM('small', 'medium', 'large')
);

It is interesting that such mature database as Oracle do not have ENUM type.

All these existed implementations have one big problem from our point of view: such enum types contains string values of only one language. Below we will describe our solution.

Continue reading [NEW] Localisable ENUM Type

Join speed improved

After some benches we have discover that loop on small joins (when only one record on the left and few on the right) is not fast enough.

We have two major algorithms internally and we have discover that the first takes only 50 seconds on 10000 loops, while the second takes hundreds of seconds…So problem presents in the second algorithm.

Improved.

Now the second algorithm takes 120 second. And we can add yet choice condition to choose the first for such case of small joins…

I think we will be able to improve the second alg yet to 70-80 sec. And may be with more complex changes it is possible to speed both after that …

Choosing appropriate database’s segment size

Each Valentina’s volume consists of set of segments, even internal service-data placed in such segments.

Database storage is implemented similar to some file system. There are volumes (.vdb, .dat, .blb, .ind, .tmp) and there are some embedded files on that volumes (all the data like field-data, indexes and so on go to that files). Each volume operates with own segment map – so we can find (and allocate new) segments for particular embedded file easy and fast.

Continue reading Choosing appropriate database’s segment size

[NEW] ORDER BY in aggregative functions First()/Last()

SQL Standard allows to SELECT only fields mentioned in the GROUP BY and expressions based on aggregative functions. You cannot SELECT a normal field. But sometimes you may very want to do this. Question is what to do in this case.

In this new Valentina Wiki article, we have describe in detail this problem and gave THREE solutions. The third solution is new for v5.0 and it works x400/ x50 times faster of the first two correspondently in tests on the database of our customer.

The third solution, uses idea from ORACLE database actually: FIRST()/LAST() functions with own ORDER BY to be used inside of each group. It seems mySQL and PostgreSQL do not have any way to resolve this task in such effective way.

New Docs About ENUM Type in v5.0

Today I added new reference pages to the wiki of e.g. SQL grammar about a new feature of Valentina DB for 5.0: the ENUM type.

This is the first step towards a CREATE TYPE command based on the SQL 2003 standard. Next we can add composite types, collection types and inheritance.

We think that Valentina can have a more effective and flexible ENUM implementation compared to PostgreSQL or mySQL. One example is that Valentina can use 1 byte to store an enum value, while mySQL – 2, PostgreSQL – 4.

We are finishing now integration of this feature into Valentina Studio for both Valentina DB and PostgreSQL plugins. Database types in Valentina Studio are supported through a plugin architecture.

These are major entry points in the Valentina Wiki to check:

[DEPRECATED] I_Field::put_ID()

We have note that put_ID() should not be in the public interfaces of Schema Objects, because developer should not be able change ID of a scheme object.

We have check our sources, and found that only I_Field interface did have such method. So we move it into internal I_FieldEx interface. In the C++ header this method marked as DEPRECATED, and do nothing now. We believe that nobody from C++ developers have used it.

We will consider future ability to do “SET PROPERTY ID of Object name” via SQL, but this will be recommended for use by our own SQL scripts only.

[NEW] VLink.Count Property

We have discover that interface of VLink miss Count property to easy ask VLink how much pairs it contains.

Now this property is implemented in the vkernel. It works for all kinds of links (Binary, ObjectPtr and ForeignKey). It is implemented in effective way — only header of index is asked actually.

Adding of this property do minor extension of VSERVER protocol also.

Index search warnings

There are many find-like methods which are able to employ indexes to speed up the searching (FindLike, FindRegEx, FindContains, and so on). Of course, the field must be  indexed and search preference is set to kPreferIndexed.

But currently there are two issues when index search may bring a wrong result:

1.Index keeps only first 256 symbols of the value – so rest of the symbols are just ignored in such a search.

2. Index-By-Words index and multi-word searching pattern.

Some of the methods like FindRegEx don’t use an index-search but scan the table instead if any of the above occurred. The rest of the methods are less strict choosing the searching algorithm. Now you will get the warning in such cases and you can switch algorithm to not-index search (kPreferNotIndexed) if it is relevant (less speedy but more accurate). Also you should consider to change index type of such a lengthly fields to the index-by-words.
Warning examples:
“Index search in the first 256 symbols only : FindContains t1:f1”
“Index-By-Words search : FindContains t1:f1”

Altering ObjectPtr

ObjectPtr field contains RecID values of some “target” table. If you need to change ObjectPtr in order to point another table those values are not valid anymore – so default behavior is to nullify them.  But sometime it looks to be too strong. For instance – changing  field or link name or on-delete policy should not cause such data zeroing.

There are two ways to keep existed values altering ObjectPtr:

  1. There is a global property – OnChangeTypeZeroPtr. It is set to true by default. But you may change it with following statement – “SET PROPERTY OnChangeTypeZeroPtr TO false”. After this altering ObjectPtr will prevent zeroing data (sure it will be nulls for records which are absent in the new target table).  This property is not “schema-storable” – so you should set it up on each run if you need it.
  2. The first approach is a global setting – so you will be able to keep some ObjectPtr values even changing “target” table. So we implement one more way – much more simple because the only thing you should keep in your mind is – the values will be preserved until you touch “target table” ObjectPtr property (because all values will be valid for the same target anyway).