How to structure a database for quick node access

I am looking for a way to structure a SQLite database for use with VirtualTreeView so that data can be retrieved quickly. VirtualTreeView has an OnInitNode event, but it is not always practical for this purpose.

The data is fetched from Usenet newsgroups and needs to be threaded. The data useful for threading is the post id (int64, also the primary key) and the references (strings that refer to previous posts in the thread).

The program searches for strings in the references and determines under which post id each post should go. For example, if a post has id 1234, the next post might be 1235, and then 1236 might be a reply to 1234.
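To make the threading step concrete, here is a minimal sketch in Python with the standard sqlite3 module. It assumes each row also stores the post's own Message-ID in a hypothetical message_id column, and that the references are kept space-separated in a column I call refs (REFERENCES is an SQL keyword); neither name comes from the actual program. Usenet References headers list ancestors oldest first, so the sketch scans from the end and takes the nearest ancestor that is present in the database.

```python
import sqlite3

def assign_parents(conn):
    """Set parent_id from the references; 0 means the post is a thread root."""
    # Map every known Message-ID to its post id.
    by_msgid = {m: pid for pid, m in conn.execute(
        "SELECT post_id, message_id FROM posts")}
    rows = conn.execute("SELECT post_id, refs FROM posts").fetchall()
    for post_id, refs in rows:
        parent = 0
        # Scan from the last reference backwards: the last entry is the
        # direct parent, earlier entries are older ancestors.
        for ref in reversed((refs or "").split()):
            if ref in by_msgid:
                parent = by_msgid[ref]
                break
        conn.execute("UPDATE posts SET parent_id = ? WHERE post_id = ?",
                     (parent, post_id))
    conn.commit()

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE posts (
    post_id    INTEGER PRIMARY KEY,
    message_id TEXT,
    refs       TEXT,
    parent_id  INTEGER NOT NULL DEFAULT 0)""")
conn.executemany(
    "INSERT INTO posts (post_id, message_id, refs) VALUES (?, ?, ?)",
    [(1234, "<a@example>", ""),
     (1235, "<b@example>", ""),
     (1236, "<c@example>", "<a@example>")])
assign_parents(conn)
result = conn.execute(
    "SELECT post_id, parent_id FROM posts ORDER BY post_id").fetchall()
print(result)
```

Falling back to an older ancestor when the direct parent is missing keeps a reply inside its thread even if some posts were never downloaded.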

Here is a possible database example:

post id    references    parent id
  1234      .... ....       0
  1235      .... ....       0
  1236      .... ....      1234

This is how the table looks at the moment.

Now, the problem is how to structure this data for faster retrieval. If there are only root nodes, I can set RootNodeCount from the number of database entries and then read them one by one in OnInitNode as they are requested. Once there are sub-nodes, I need to rearrange the database somehow so that it can fetch the sub-nodes quickly, depending on which node is opened.

I was thinking of adding a field "has_subnodes" that holds the ID of the sub-node that follows. When a node is clicked, the program then reads that node and every linked node.

How would you organize this database so that it could be read nicely in OnInitNode, or would you not use that event at all? Nodes may also be created using the AddChildNoInit() method. Any ideas or pointers would be welcome.
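As a point of comparison for the "has_subnodes" idea, here is a rough Python/sqlite3 sketch of the plain parent-id layout with an index on parent_id. The table and index names are made up for illustration. The child count is computed by a subquery instead of being stored, so nothing has to be kept in sync when new posts arrive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    post_id   INTEGER PRIMARY KEY,
    parent_id INTEGER NOT NULL DEFAULT 0
);
-- Without this index every "children of X" lookup scans the whole table.
CREATE INDEX idx_posts_parent ON posts (parent_id, post_id);
INSERT INTO posts VALUES (1234, 0), (1235, 0), (1236, 1234);
""")

def children_of(parent_id):
    """Return (post_id, child_count) for each direct child of parent_id."""
    return conn.execute(
        """SELECT p.post_id,
                  (SELECT COUNT(*) FROM posts c WHERE c.parent_id = p.post_id)
           FROM posts p
           WHERE p.parent_id = ?
           ORDER BY p.post_id""",
        (parent_id,)).fetchall()

roots = children_of(0)       # what the tree needs for its top level
replies = children_of(1234)  # fetched only when node 1234 is opened
print(roots, replies)
```

The number of rows returned for parent 0 is what RootNodeCount would be set to, and the per-row child count tells the tree whether a node needs an expand button before any of its children are read.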

UPDATE (AND HOW I SOLVED IT)

There is some non-virtualtreeview related information available here: Implementing a hierarchical data structure in a database

What I ended up doing is using Modified Preorder Tree Traversal to store information about the nodes in the database. Every time a certain node is requested:

a) It is first looked up in an internal cache that holds basically the same structure as the VirtualTreeView.

b) If it is found in the cache, that cache entry is removed (the cache never holds more than 100 items).

c) If it is not found, an additional 100 items are added to the cache (50 up from the requested node and 50 down). This number can of course be changed to 500 or 1000 items if needed. There are some additional checks to see how far up/down it needs to read, to avoid reading too many duplicate entries.

d) If I need more speed, I may apply an additional technique: load nodes from the database based on how much the user scrolls the VirtualTreeView, similar to how std::vector allocates memory. First I load only 100 nodes, then if the user scrolls a lot I load 200, then 400, and so on. The more the user scrolls, the faster the entire tree loads, but it still isn't loaded if he/she never scrolls.
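The windowed cache from steps a) to c) can be sketched roughly as follows in Python with sqlite3. This is a simplified stand-in and not the actual code: it keeps entries after a hit and replaces the whole window on a miss, it skips the duplicate-avoidance checks, and the class name, the posts table and the lft column (the MPTT "left" value, used here only for display order) are assumptions.

```python
import sqlite3

class WindowCache:
    """Keeps a bounded window of rows around the last requested position."""

    def __init__(self, conn, half_window=50):
        self.conn = conn
        self.half = half_window
        self.rows = {}  # position in display order -> post_id

    def get(self, pos):
        """Return the post_id shown at row `pos`, or None past the end."""
        if pos not in self.rows:
            self._load_around(pos)
        return self.rows.get(pos)

    def _load_around(self, pos):
        # Read half_window rows above and below the requested position.
        start = max(0, pos - self.half)
        fetched = self.conn.execute(
            "SELECT post_id FROM posts ORDER BY lft LIMIT ? OFFSET ?",
            (2 * self.half, start)).fetchall()
        # Replace the old window so memory use stays bounded.
        self.rows = {start + i: r[0] for i, r in enumerate(fetched)}

# Sample data: 1000 flat posts whose lft values give the display order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, lft INTEGER)")
conn.executemany("INSERT INTO posts VALUES (?, ?)",
                 [(1000 + i, 2 * i + 1) for i in range(1000)])
conn.execute("CREATE INDEX idx_posts_lft ON posts (lft)")

cache = WindowCache(conn)
first = cache.get(500)          # miss: loads rows 450..549
window_size = len(cache.rows)
print(first, window_size)
```

SQLite still has to step over the skipped rows for OFFSET, so for very large trees a stored position column with a range condition would probably be a better fit. The growing batch size from step d) could be added by doubling half_window after each miss.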

This way, the nodes that are never seen are never loaded from database. It works fine for scrolling with mouse wheel (with occasional short delay when it passes the point where cache is empty and needs more data from disk) and for scrolling with arrow buttons/keys. It is a bit slower when you drag scrollbar to certain position (say from bottom to the middle) but that is expected as data cannot be fetched from disk instantly.

It is best to decide in advance how much memory to use for the cache/items before loading them. The more memory, the faster the scrolling, but of course that memory is wasted if the data is never displayed.


You are looking to store hierarchical data in a database.
The problem is that SQL is not very well equipped to deal with this kind of data.

You have a number of solutions, each with its pros and cons.
Here are some links if you want to read up on each of the approaches:

http://www.sitepoint.com/hierarchical-data-database/
http://www.sitepoint.com/hierarchical-data-database-2/

My personal favorite is Modified Preorder Tree Traversal.

Here you store a left and a right value for each node in the database in a very counterintuitive way, which makes insertion of nodes a bit slow but retrieval lightning fast.

You can code your logic in Delphi, but I prefer to use stored procedures in my database of choice.
That way your logic in Delphi stays simple, and if the database changes, your Delphi code does not have to. If you want I can include SQL code for the stored procedures, but not right now, because that code is not on the laptop I've got with me now.
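For anyone who has not seen the left/right numbering before, here is a small illustrative sketch in Python with sqlite3 (not the stored procedures mentioned in this answer). It assumes a posts table with post_id, parent_id, lft and rgt columns, all hypothetical names, numbers the tree in one depth-first walk, and then fetches a whole subtree with a single range query.

```python
import sqlite3

def rebuild_mptt(conn, root_parent=0):
    """Assign lft/rgt to every row by walking the parent_id links."""
    children = {}
    for post_id, parent_id in conn.execute(
            "SELECT post_id, parent_id FROM posts ORDER BY post_id"):
        children.setdefault(parent_id, []).append(post_id)

    counter = 0

    def visit(node):
        # Recursive for brevity; very deep threads would need a stack.
        nonlocal counter
        counter += 1
        lft = counter            # number handed out on the way down
        for child in children.get(node, []):
            visit(child)
        counter += 1             # number handed out on the way back up
        conn.execute("UPDATE posts SET lft = ?, rgt = ? WHERE post_id = ?",
                     (lft, counter, node))

    for root in children.get(root_parent, []):
        visit(root)
    conn.commit()

def subtree(conn, post_id):
    """Return a node and all its descendants in display (preorder) order."""
    return [r[0] for r in conn.execute(
        """SELECT c.post_id FROM posts c, posts p
           WHERE p.post_id = ? AND c.lft BETWEEN p.lft AND p.rgt
           ORDER BY c.lft""", (post_id,))]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE posts (
    post_id INTEGER PRIMARY KEY, parent_id INTEGER,
    lft INTEGER, rgt INTEGER)""")
conn.executemany(
    "INSERT INTO posts (post_id, parent_id) VALUES (?, ?)",
    [(1234, 0), (1235, 0), (1236, 1234), (1237, 1236), (1238, 1234)])
rebuild_mptt(conn)
thread = subtree(conn, 1234)
bounds = conn.execute(
    "SELECT lft, rgt FROM posts WHERE post_id = 1234").fetchone()
print(thread, bounds)
```

With this numbering a node's descendant count is (rgt - lft - 1) / 2, so the tree can learn how many rows sit below a node without reading them. The slow part is insertion: adding a post in the middle means shifting the lft/rgt values of every row to its right.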


Not the most elegant, but this is the method I use to populate my trees.

It only requires data access for two simple queries, and the rest is all done client side.

It will load tens of thousands of nodes with ease. (Looking at it now, I could probably get away with just one query. It's a bit old!):

procedure TFrameComponentViewer.LoadComponentTree;
var
RootNode : PVirtualNode; //assumed local; remove if RootNode is a field of the frame
RootNodeData : PMasterComponent;
CompQ,ParentQ : TMyQuery;

procedure PopulateNodeData(Node: PVirtualNode;ComponentID : integer);
var NodeData : PMasterComponent;
begin
   if CompQ.Locate('ComponentID',ComponentID,[loCaseInsensitive]) then
   begin
     NodeData := TreeComponents.GetNodeData(Node);
     //Populate your desired TreeData
     NodeData.ComponentID := CompQ.Fields[fldComponentID].AsInteger;
     NodeData.ComponentCode := CompQ.Fields[fldComponentCode].AsString;
     NodeData.ComponentType := CompQ.Fields[fldComponentType].AsInteger;
     NodeData.IsPipeline := CompQ.Fields[fldComponentIsPipeline].AsBoolean;
     NodeData.Description := CompQ.Fields[fldComponentDescription].AsString;
     NodeData.StartKP := CompQ.Fields[fldComponentStartKP].AsFloat;
     NodeData.EndKP := CompQ.Fields[fldComponentEndKP].AsFloat;
     NodeData.Diameter := CompQ.Fields[fldComponentDiameter].AsFloat;
     NodeData.WallThickness := CompQ.Fields[fldComponentWallThickness].AsFloat;
     NodeData.CriticalSpanLength := CompQ.Fields[fldComponentCSL].AsFloat;
     NodeData.Historical := CompQ.Fields[fldComponentHistorical].AsBoolean;
   end;
end;

procedure AddNodesRecursive(ParentNode : PVirtualNode;ParentNodeID : Integer);
var AddedNode : PVirtualNode;
AddedNodeData : PMasterComponent;
Children : Array of Integer;
i : Integer;
begin
     //Collect the child IDs first: the recursive calls below change the filter
     ParentQ.Filtered := False;
     ParentQ.Filter := 'Parent_ID = '+IntToStr(ParentNodeID);
     ParentQ.Filtered := True;
     ParentQ.First;
     SetLength(Children,ParentQ.RecordCount);
     for i:=0 to ParentQ.RecordCount-1 do
     begin
          Children[i] := ParentQ.Fields[0].AsInteger;
          ParentQ.Next;
     end;
     for i:=0 to High(Children) do
     begin
          AddedNode := TreeComponents.AddChild(ParentNode);
          AddedNodeData := TreeComponents.GetNodeData(AddedNode);
          System.Initialize(AddedNodeData^); //initialize memory
          PopulateNodeData(AddedNode,Children[i]);
          //Recurse with the ID we already hold, so it works even if Locate failed
          AddNodesRecursive(AddedNode,Children[i]);
     end;
end;

begin
   CompQ := TMyQuery.Create(nil);
   ParentQ := TMyQuery.Create(nil);
   TreeComponents.BeginUpdate;
   try
      TreeComponents.Clear;
      //All component data, looked up by ID in PopulateNodeData
      CompQ.Connection := DataBaseline.BaseLineConnection;
      CompQ.SQL.Add('SELECT * FROM Components');
      CompQ.Open;
      //ID/parent pairs, filtered per parent in AddNodesRecursive
      ParentQ.Connection := DataBaseline.BaseLineConnection;
      ParentQ.SQL.Add('SELECT ComponentID,Parent_ID FROM Components ORDER BY OrderNo');
      ParentQ.Open;
      RootNode := TreeComponents.AddChild(nil);
      RootNodeData := TreeComponents.GetNodeData(RootNode);
      System.Initialize(RootNodeData^); //initialize memory
      RootNodeData.ComponentID := -1;
      AddNodesRecursive(RootNode,-1);
      TreeComponents.FullExpand; //expand while updates are still locked
   finally
      TreeComponents.EndUpdate;
      CompQ.Close;
      ParentQ.Close;
      FreeAndNil(CompQ);
      FreeAndNil(ParentQ);
   end;
end;

Note: the ORDER BY column is optional; I require it because my trees are order-specific.

So the DB has these three columns, plus any custom data you require:

ID , ParentID (-1 for no parent), OrderNo
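On the remark that one query might be enough: a rough sketch of that idea, written in Python with sqlite3 so it can be run on its own (it is not a drop-in replacement for the Delphi routine). It reads the ID/parent pairs once, groups them by parent in memory, and walks the result. The column names follow the queries in the code; the sample rows and function names are made up.

```python
import sqlite3
from collections import defaultdict

def load_tree(conn):
    """Return {parent_id: [child ids in OrderNo order]} from one query."""
    children = defaultdict(list)
    for comp_id, parent_id in conn.execute(
            "SELECT ComponentID, Parent_ID FROM Components ORDER BY OrderNo"):
        children[parent_id].append(comp_id)
    return children

def walk(children, parent_id=-1, depth=0):
    """Yield (id, depth) in display order, starting below parent_id."""
    for comp_id in children.get(parent_id, []):
        yield comp_id, depth
        yield from walk(children, comp_id, depth + 1)

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Components (
    ComponentID INTEGER PRIMARY KEY, Parent_ID INTEGER, OrderNo INTEGER)""")
conn.executemany("INSERT INTO Components VALUES (?, ?, ?)",
                 [(1, -1, 20), (2, -1, 10), (3, 1, 40), (4, 1, 30)])
order = list(walk(load_tree(conn)))
print(order)
```

Grouping by parent up front avoids re-filtering the dataset for every node, which is where most of the time goes once the tree gets large.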
