1
0
Fork 0
mirror of https://github.com/chrislusf/seaweedfs synced 2024-06-28 21:31:56 +02:00

Update README.md

This commit is contained in:
Chris Lu 2018-10-17 18:44:49 -07:00 committed by GitHub
parent f3f0f9750a
commit f4ad5b0579
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23

View file

@ -280,16 +280,8 @@ All file meta information on volume server is readable from memory without disk
Most other distributed file systems seem more complicated than necessary.
### Compared to Ceph ###
Ceph can be setup similar to SeaweedFS as a key->blob store. It is much more complicated, with the need to support layers on top of it. [Here is a more detailed comparison](https://github.com/chrislusf/seaweedfs/issues/120)
SeaweedFS is meant to be fast and simple, both during usage and during setup. If you do not understand how it works when you reach here, we failed! Please raise an issue with any questions or update this file with clarifications.
SeaweedFS has a centralized master to look up free volumes, while Ceph uses hashing to locate its objects. Having a centralized master makes it easy to code and manage. HDFS/GFS has the single name node for years. SeaweedFS now support multiple master nodes.
Ceph hashing avoids SPOF, but makes it complicated when moving or adding servers.
### Compared to HDFS ###
HDFS uses the chunk approach for each file, and is ideal for storing large files.
@ -304,9 +296,9 @@ SeaweedFS can also store extra large files by splitting them into manageable dat
The architectures are mostly the same. SeaweedFS aims to store and read files fast, with a simple and flat architecture. The main differences are
* SeaweedFS optimizes for small files, ensuring O(1) disk seek operation, and can also handle large files.
* SeaweedFS statically assign a volume id for a file. Locating file content becomes just a lookup of the volume id, which can be easily cached.
* SeaweedFS Filer metadata store can be any well-known and proven data stores, e.g., Cassandra, Redis, MySql, PostGres, etc, and is easy to customized.
* SeaweedFS Volume server also communicate directly with clients via HTTP, supporting range queries, direct uploads, etc.
* SeaweedFS statically assigns a volume id for a file. Locating file content becomes just a lookup of the volume id, which can be easily cached.
* SeaweedFS Filer metadata store can be any well-known and proven data stores, e.g., Cassandra, Redis, MySql, Postgres, etc, and is easy to customized.
* SeaweedFS Volume server also communicates directly with clients via HTTP, supporting range queries, direct uploads, etc.
| System | File Meta | File Content Read| POSIX | REST API | Optimized for small files |
| ------------- | ------------------------------- | ---------------- | ------ | -------- | ------------------------- |
@ -323,10 +315,18 @@ GlusterFS hashes the path and filename into ids, and assigned to virtual volumes
### Compared to Ceph ###
Ceph can be setup similar to SeaweedFS as a key->blob store. It is much more complicated, with the need to support layers on top of it. [Here is a more detailed comparison](https://github.com/chrislusf/seaweedfs/issues/120)
SeaweedFS has a centralized master group to look up free volumes, while Ceph uses hashing and metadata servers to locate its objects. Having a centralized master makes it easy to code and manage.
Same as SeaweedFS, Ceph is also based on a object store RADOS. Ceph is rather complicated with mixed reviews.
Ceph uses CRUSH hashing to automatically manage the data placement. SeaweedFS places data by assigned volumes.
SeaweedFS is optimized for small files. Small files are stored as one continuous block of content, with at most 8 unused bytes between files. Small file access is O(1) disk read.
SeaweedFS Filer uses off-the-shelf stores, such as MySql, Postgres, Redis, Cassandra, to manage file directories. There are proven, scalable, and easier to manage.
| SeaweedFS | comparable to Ceph | advantage |
| ------------- | ------------- | ---------------- |
| Master | MDS | simpler |