RESTful API for SAN/NAS system

I need to build a RESTful API for files that live on clustered-filesystem volumes. I have about 20 servers that all share the same filesystems. All I need is a RESTful service that lets me stat(), read(), write(), listFolder(), delete(), setacl() etc. Everything else is handled by the cluster filesystem, so those operations are enough. It needs to be reasonably mature: it should support access control lists, offer a high-performance API (Java, for example), be actively maintained, and run on Linux; locking support would also be very useful. I would like to add functions of my own, such as getDuration(), so open source would be an advantage. If you are aware of existing code that would help me build something like this, I would be very grateful.
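
Roughly, this is the operation set I have in mind (just a sketch to make the question concrete; the names and signatures are illustrative, not an existing library):

    import java.io.InputStream;
    import java.util.List;

    // Illustrative interface only: the operations listed above, plus the kind of
    // custom call (getDuration) I would want to add myself.
    public interface VolumeFileService {

        record FileStat(long size, boolean directory, long modifiedMillis) {}

        FileStat stat(String path);
        InputStream read(String path);
        void write(String path, InputStream data);
        List<String> listFolder(String path);
        void delete(String path);
        void setAcl(String path, String aclSpec);
        void lock(String path);            // locking support would be very useful
        long getDuration(String path);     // example of a function I'd add myself
    }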

The purpose is to allow a BPM system to check whether files are OK on the various StorNext volumes. Since these systems sit behind various firewalls, and mounting NFS or SMB is not really workable because of high-availability requirements, the best option seems to be a RESTful API as the single entry point for all file operations between firewall zones: a convenient HTTP(S) request instead of NFS or SSH.


If you want a very generic, web-based API to manipulate files

Look into the design of the WebDAV API. It's OK if you don't want to use it as-is; just look at it for API inspiration. Notice how stat(), listFolder() and setacl() can each boil down to a single command (PROPFIND covers both stat and folder listing, PROPPATCH covers property changes). If you are looking for something time-tested, this is the one. The API was designed for web-based file access, and people have put wrappers around it to make it mountable just like any other file system (see davfs2), which to me is proof of a solid and complete API.
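
For instance, a single PROPFIND with Depth: 1 returns the properties of a folder and of each entry in it, i.e. listFolder() and stat() in one round trip. A minimal sketch with the JDK's own HTTP client, against a hypothetical DAV share URL:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PropfindExample {
        public static void main(String[] args) throws Exception {
            // Ask the WebDAV server for all properties of a folder and its direct children.
            String body = "<?xml version=\"1.0\"?><D:propfind xmlns:D=\"DAV:\"><D:allprop/></D:propfind>";

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://files.example.com/volumes/vol1/"))  // hypothetical DAV share
                    .header("Depth", "1")              // 0 = this resource only (stat), 1 = include children (list)
                    .header("Content-Type", "application/xml")
                    .method("PROPFIND", HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());

            System.out.println(response.statusCode()); // 207 Multi-Status, one <D:response> per entry
            System.out.println(response.body());
        }
    }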

Now, presuming you don't want full DAV but something simpler, I'd look into libraries that can help you build a similar API. Check out these: the Jackrabbit WebDAV library and milton.io. There is also, of course, the Jigsaw project to steal code from. Use them to expose your ad-hoc API, or a selection of StorNext API calls, over HTTP.
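
If you go the simpler route, the shape of such an ad-hoc API could look something like the sketch below. It is not Jackrabbit or Milton code, just a plain JAX-RS resource (the /stornext/vol1 root and the URL layout are assumptions of mine) exposing stat() and listFolder() over HTTP; the remaining operations would follow the same pattern:

    import javax.ws.rs.GET;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.MediaType;
    import javax.ws.rs.core.Response;

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.attribute.BasicFileAttributes;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    // Hypothetical resource rooted at one StorNext mount point.
    @Path("/files")
    @Produces(MediaType.APPLICATION_JSON)
    public class FileResource {

        private static final java.nio.file.Path ROOT = Paths.get("/stornext/vol1"); // assumed mount point

        @GET
        @Path("/stat/{path: .+}")
        public Response stat(@PathParam("path") String path) throws IOException {
            java.nio.file.Path p = ROOT.resolve(path).normalize();
            if (!Files.exists(p)) {
                return Response.status(Response.Status.NOT_FOUND).build();
            }
            BasicFileAttributes attrs = Files.readAttributes(p, BasicFileAttributes.class);
            return Response.ok(Map.of(
                    "size", attrs.size(),
                    "directory", attrs.isDirectory(),
                    "modified", attrs.lastModifiedTime().toString())).build();
        }

        @GET
        @Path("/list/{path: .+}")
        public List<String> listFolder(@PathParam("path") String path) throws IOException {
            try (Stream<java.nio.file.Path> entries = Files.list(ROOT.resolve(path).normalize())) {
                return entries.map(e -> e.getFileName().toString()).collect(Collectors.toList());
            }
        }
    }

Deployed in any JAX-RS container with a JSON provider (Jersey plus Jackson, for example), write(), delete() and setacl() would be added as PUT/DELETE/POST methods in the same style.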

If you want a less generic API to manipulate blobs

Check out the Amazon S3 API as an inspiration, and code like littles3 as an implementation example. There are plenty of projects like this; check this search.
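
One nice side effect of copying the S3 semantics (PUT/GET/DELETE on bucket/key URLs) is that existing S3 tooling can talk to your service. A hedged illustration, assuming your server really is S3-compatible (littles3, MinIO, and so on) and with a made-up endpoint, bucket name and credentials:

    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.client.builder.AwsClientBuilder;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;

    import java.io.File;

    public class BlobClientExample {
        public static void main(String[] args) {
            // Point the stock AWS SDK (v1) at the hypothetical S3-compatible endpoint.
            AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                    .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
                            "http://storage.example.com:8080", "us-east-1"))
                    .withCredentials(new AWSStaticCredentialsProvider(
                            new BasicAWSCredentials("access-key", "secret-key")))
                    .withPathStyleAccessEnabled(true)   // bucket in the path, not the hostname
                    .build();

            // Buckets could map onto top-level directories of the clustered volume.
            s3.putObject("stornext-vol1", "incoming/report.mxf", new File("/tmp/report.mxf"));
            System.out.println("stored: " + s3.doesObjectExist("stornext-vol1", "incoming/report.mxf"));
        }
    }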

Notice how what you want falls in between what is already available:

  • WebDAV (a full stack, from API to server implementations), which hides and abstracts away the underlying file system. Very high level, so you can't take advantage of StorNext features
  • the StorNext API, which is very low level, so no suitable web layer exists

If you want an API tailored to your domain

Typically, when faced with a challenge like yours, people leverage their domain knowledge and use cases. If you need this API for picture storage and retrieval, forget generic file operations and model your API around a collection of images (see the sketch after the list below). You know a lot up front that makes API design a much simpler job, for instance:

  • min/max/average file size
  • usage patterns, read/write i/o
  • no need of streaming
  • immutability of the file content (no incremental changes)
  • etc
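
As a rough sketch of what "model your API around a collection of images" could mean in practice (the resource names and the JAX-RS usage are my own illustration, not an existing project):

    import javax.ws.rs.Consumes;
    import javax.ws.rs.GET;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;
    import javax.ws.rs.Produces;
    import javax.ws.rs.core.Response;

    import java.net.URI;
    import java.util.UUID;

    // Domain-level API: callers deal in images and ids, never in paths or file handles.
    @Path("/images")
    public class ImageResource {

        @POST
        @Consumes("image/jpeg")
        public Response upload(byte[] jpegBytes) {
            String id = UUID.randomUUID().toString();
            // A real implementation would write jpegBytes to the clustered volume under
            // a path derived from the id; that part is elided in this sketch.
            return Response.created(URI.create("/images/" + id)).build();
        }

        @GET
        @Path("/{id}")
        @Produces("image/jpeg")
        public Response fetch(@PathParam("id") String id) {
            // A real implementation would resolve the id and stream the stored bytes back.
            return Response.ok(new byte[0]).build();
        }
    }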

    Have you looked at rails-api? I'm not sure whether it supports all the functions you need, but it is maintained and open source.

    https://github.com/rails-api/rails-api

    You could also include a Ruby gem to handle access control lists.

    https://www.ruby-toolbox.com/projects/acl9


    I'd recommend looking into a WebDAV implementation -- they're usually integrated into a web server (like Apache) and support most of the standard filesystem operations you require.

    If you really want to build it yourself, you could also fire up an object storage platform like OpenStack's "Swift" project, backed by your SAN or NAS appliance over NFS/iSCSI.

    EDIT: You want to store a large number of photos. There are various NoSQL databases that would also solve this problem. However, you could also solve it using a native network filesystem protocol like NFS.

    NFS will perform predictably well (v4.1+ anyway) on the majority of your typical read-and-write filesystem operations. However, you'll also need a way to index and retrieve photo metadata and provide access control mechanisms, and those are where performance can get complicated.

    When a file is uploaded through your HTTP API, calculate the MD5 hash of its contents, and store the original file name, owner UID and other metadata in a relational database. Then write the photo to your NFS mount in a specific "bucket".

    For example, assume you have a photo whose content has the MD5 hash e240a38624f4a370bd2ec65cf771134b. Assuming your NFS mount is at /srv/content, you would write the photo to the path /srv/content/e240/a38624f4/a370bd2ec65cf771134b.jpg -- splitting the MD5 hash to create prefixed folders.

    When your user later wants to retrieve the image, they request it via the data stored in the relational database; your API looks up the photo's MD5 hash and then locates the file on the filesystem with the same splitting operation.
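
    A minimal sketch of the hashing and path-splitting step (the /srv/content root and the .jpg extension follow the example above; everything else is illustrative):

        import java.io.InputStream;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.nio.file.Paths;
        import java.security.MessageDigest;

        public class ContentAddressedStore {

            private static final Path ROOT = Paths.get("/srv/content");   // the NFS mount from the example

            // Builds /srv/content/e240/a38624f4/a370bd2ec65cf771134b.jpg for the hash shown above.
            static Path pathFor(String md5Hex, String extension) {
                return ROOT.resolve(md5Hex.substring(0, 4))
                           .resolve(md5Hex.substring(4, 12))
                           .resolve(md5Hex.substring(12) + "." + extension);
            }

            static String md5Hex(Path file) throws Exception {
                MessageDigest md = MessageDigest.getInstance("MD5");
                try (InputStream in = Files.newInputStream(file)) {
                    byte[] buf = new byte[8192];
                    int n;
                    while ((n = in.read(buf)) != -1) {
                        md.update(buf, 0, n);
                    }
                }
                StringBuilder hex = new StringBuilder();
                for (byte b : md.digest()) {
                    hex.append(String.format("%02x", b));
                }
                return hex.toString();
            }

            public static void main(String[] args) throws Exception {
                Path upload = Paths.get(args[0]);            // the file received by the HTTP API
                Path target = pathFor(md5Hex(upload), "jpg");
                Files.createDirectories(target.getParent());
                Files.copy(upload, target);                  // name, owner and hash go into the relational DB
                System.out.println("stored at " + target);
            }
        }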

    Please be aware that using MD5 could result in collisions if you have a very large number of differing files, so you may want to use another hashing scheme or a combination of two or more to prevent that from occurring.
