Saturday, 07 July 2018

Object storage (also known as object-based storage) is a computer data storage architecture that manages data as objects, as opposed to other storage architectures such as file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. Each object typically includes the data itself, a variable amount of metadata, and a globally unique identifier. Object storage can be implemented at multiple levels, including the device level (object storage device), the system level, and the interface level. In each case, object storage seeks to enable capabilities not addressed by other storage architectures, such as interfaces that are directly programmable by the application, a namespace that can span multiple instances of physical hardware, and data management functions such as data replication and data distribution at object-level granularity.

Object storage systems allow the retention of massive amounts of unstructured data. Object storage is used for purposes such as storing photos on Facebook, songs on Spotify, or files in online collaboration services such as Dropbox.





History

Origins

In 1995, research led by Garth Gibson on Network-Attached Secure Disks (NASD) first promoted the concept of splitting less common operations, such as namespace manipulation, from common operations, such as reads and writes, to optimize the performance and scale of both. In the same year, a Belgian company, FilePool, was established to build the basis for archiving functions. Object storage was proposed at Gibson's Carnegie Mellon University lab as a research project in 1996. Another key concept was abstracting the writes and reads of data into more flexible data containers (objects). Fine-grained access control through the object storage architecture was further described by one of the NASD team, Howard Gobioff, who later was one of the inventors of the Google File System. Other related work includes the Coda filesystem project at Carnegie Mellon, which started in 1987 and spawned the Lustre file system. There is also the OceanStore project at UC Berkeley, which started in 1999.

Centera made its debut in 2002. Its technology, called content-addressable storage, was developed at FilePool, which EMC Corporation acquired in 2001.

Development

From 1999 to 2013, at least $300 million of venture financing was related to object storage, including vendors such as SwiftStack, Amplidata, Bycast, Cleversafe, Cloudian, Nirvanix, and Scality. This does not include engineering from systems vendors such as DataDirect Networks (WOS), Dell EMC (Elastic Cloud Storage, Centera, Atmos), HDS (Hitachi Content Platform (HCP)), IBM, NetApp (StorageGRID), and Red Hat (GlusterFS); from cloud service vendors such as Amazon (AWS S3 in 2006), Microsoft (Microsoft Azure), Oracle (Oracle Cloud) and Google (Google Cloud Storage in 2010); or open-source development in Lustre, OpenStack (Swift), MogileFS, Ceph and OpenIO. An article illustrating a timeline of products was published in July 2016.




Architecture

Storage abstraction

One of the design principles of object storage is to abstract some of the lower layers of storage away from administrators and applications. Thus, data is exposed and managed as objects instead of files or blocks. Objects contain additional descriptive properties that can be used for better indexing or management. Administrators do not have to perform lower-level storage functions such as constructing and managing logical volumes to utilize disk capacity, or setting RAID levels to deal with disk failure.

Object storage also allows individual objects to be addressed and identified by more than just a file name and file path. Object storage adds a unique identifier within a bucket, or across the entire system, to support much larger namespaces and eliminate name collisions.
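
The identifier scheme described above can be illustrated with a minimal in-memory sketch. The class name FlatObjectStore and the use of random UUIDs are assumptions made for illustration; real systems choose their own identifier formats.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory store: objects live in a flat namespace, keyed by ID."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata)

    def put(self, data: bytes, metadata: dict) -> str:
        # The store, not the caller, assigns the identifier, so two objects
        # that share a human-readable name still get distinct IDs.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        return self._objects[object_id]


store = FlatObjectStore()
first = store.put(b"draft 1", {"name": "report.txt"})
second = store.put(b"draft 2", {"name": "report.txt"})
```

Both objects carry the same name in their metadata, yet each is retrieved unambiguously through its own identifier.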

Inclusion of rich custom metadata within the object

Object storage explicitly separates file metadata from data to support additional capabilities. As opposed to the fixed metadata in file systems (filename, creation date, type, etc.), object storage provides full-function, custom, object-level metadata in order to:

  • Capture application-specific or user-specific information for better indexing purposes
  • Support data management policies (e.g., a policy to drive object movement from one storage tier to another)
  • Centralize management of storage across many individual nodes and clusters
  • Optimize metadata storage (e.g., encapsulated, database, or key-value storage) and caching/indexing (when authoritative metadata is encapsulated with the metadata inside the object) independently from the data storage (e.g., unstructured binary storage)
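
As a rough illustration of a metadata-driven policy, the sketch below moves objects between tiers based on a custom attribute. The tier names "hot" and "cold", the attribute days_since_access, and the function name are all hypothetical; they are not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str
    tier: str                                   # hypothetical tier names: "hot" or "cold"
    metadata: dict = field(default_factory=dict)


def apply_tiering_policy(objects, max_idle_days):
    """Move hot objects whose 'days_since_access' metadata exceeds the limit."""
    moved = []
    for obj in objects:
        idle = obj.metadata.get("days_since_access", 0)
        if obj.tier == "hot" and idle > max_idle_days:
            obj.tier = "cold"
            moved.append(obj.object_id)
    return moved


catalog = [
    StoredObject("a", "hot", {"days_since_access": 2, "project": "alpha"}),
    StoredObject("b", "hot", {"days_since_access": 90, "project": "beta"}),
]
moved_ids = apply_tiering_policy(catalog, max_idle_days=30)
```

The policy never inspects the object data itself; it decides purely from the metadata attached to each object.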

In addition, in some object-based file system implementations:

  • File system clients contact the metadata server only once, when the file is opened, and then get content directly from the object storage servers (as opposed to block-based file systems, which require constant metadata access)
  • Data objects can be configured on a per-file basis to allow adaptive stripe width, even across multiple object storage servers, supporting optimizations in bandwidth and I/O
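
To make the striping idea concrete, here is a minimal sketch of how a client might translate a file offset into a location on one of several object storage servers, using layout parameters it obtained once at open time. Round-robin placement and the parameter names stripe_unit and server_count are assumptions; actual file systems define their own layouts.

```python
def locate(offset, stripe_unit, server_count):
    """Map a file byte offset to (server index, offset inside that server's object)."""
    stripe_number = offset // stripe_unit          # which stripe unit the byte falls in
    server_index = stripe_number % server_count    # round-robin across servers
    units_on_server = stripe_number // server_count
    object_offset = units_on_server * stripe_unit + offset % stripe_unit
    return server_index, object_offset


# Layout the client would obtain once, at open time (values are made up).
layout = {"stripe_unit": 4, "server_count": 3}
placements = [locate(o, **layout) for o in (0, 4, 8, 12, 13)]
```

With this layout, bytes 0, 4 and 8 land at the start of the objects on servers 0, 1 and 2, and byte 12 wraps around to server 0 again. No further metadata lookups are needed for any read or write.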

Object-based storage devices (OSD) and some software implementations (e.g., Caringo Swarm) manage metadata and data at the storage device level:

  • Instead of providing a block-oriented interface that reads and writes fixed-size blocks of data, data is organized into flexible-size data containers, called objects
  • Each object has both data (an uninterpreted sequence of bytes) and metadata (an extensible set of attributes describing the object); physically encapsulating the two together benefits recoverability
  • The command interface includes commands to create and delete objects, write bytes and read bytes to and from individual objects, and to set and get attributes on objects
  • Security mechanisms provide per-object and per-command access control
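
The command set listed above can be sketched as a small in-memory class. This is only an illustration of the shape of the interface, not the T10 wire protocol; the class name ToyOSD, the method names, and the grant model (a set of permitted command names per object) are assumptions.

```python
class ToyOSD:
    """In-memory stand-in for an object-based storage device."""

    def __init__(self):
        self._data = {}    # object ID -> bytearray
        self._attrs = {}   # object ID -> dict of attributes
        self._grants = {}  # object ID -> set of permitted command names

    def create(self, object_id, allowed_commands):
        self._data[object_id] = bytearray()
        self._attrs[object_id] = {}
        self._grants[object_id] = set(allowed_commands)

    def _check(self, object_id, command):
        # Access is decided per object and per command.
        if command not in self._grants[object_id]:
            raise PermissionError(f"{command} not permitted on {object_id}")

    def write(self, object_id, offset, payload):
        self._check(object_id, "write")
        buf = self._data[object_id]
        end = offset + len(payload)
        if len(buf) < end:
            buf.extend(b"\x00" * (end - len(buf)))  # objects grow on demand
        buf[offset:end] = payload

    def read(self, object_id, offset, length):
        self._check(object_id, "read")
        return bytes(self._data[object_id][offset:offset + length])

    def set_attr(self, object_id, name, value):
        self._check(object_id, "set_attr")
        self._attrs[object_id][name] = value

    def get_attr(self, object_id, name):
        self._check(object_id, "get_attr")
        return self._attrs[object_id][name]

    def delete(self, object_id):
        self._check(object_id, "delete")
        del self._data[object_id], self._attrs[object_id], self._grants[object_id]


osd = ToyOSD()
osd.create(7, {"read", "write", "get_attr", "set_attr"})
osd.write(7, 0, b"hello")
osd.set_attr(7, "content-type", "text/plain")
```

Because "delete" was not granted for object 7, calling osd.delete(7) raises PermissionError while reads and writes on the same object succeed.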

Programmatic data management

Object storage provides programmatic interfaces that allow applications to manipulate data. At the base level, this includes create, read, update and delete (CRUD) functions for basic read, write, and delete operations. Some object storage implementations go further, supporting additional functionality such as object versioning, object replication, life-cycle management, and movement of objects between different tiers and types of storage. Most API implementations are REST-based, allowing the use of many standard HTTP calls.
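
The mapping from HTTP-style calls to CRUD operations can be sketched without any network code. The class below is a toy dispatcher; its name, the path format, and the status codes it returns are illustrative assumptions and do not reproduce the behavior of any specific service.

```python
class ToyRestObjectAPI:
    """Maps HTTP-style verbs onto create/read/update/delete; no network involved."""

    def __init__(self):
        self._objects = {}  # path such as "/bucket/key" -> bytes

    def handle(self, method, path, body=b""):
        if method == "PUT":                       # create or update
            created = path not in self._objects
            self._objects[path] = body
            return (201 if created else 200), b""
        if method == "GET":                       # read
            if path in self._objects:
                return 200, self._objects[path]
            return 404, b""
        if method == "DELETE":                    # delete
            if self._objects.pop(path, None) is None:
                return 404, b""
            return 204, b""
        return 405, b""                           # verb not supported by this sketch


api = ToyRestObjectAPI()
responses = [
    api.handle("PUT", "/photos/cat.jpg", b"v1"),
    api.handle("PUT", "/photos/cat.jpg", b"v2"),
    api.handle("GET", "/photos/cat.jpg"),
    api.handle("DELETE", "/photos/cat.jpg"),
    api.handle("GET", "/photos/cat.jpg"),
]
```

The second PUT overwrites the whole object, which reflects the usual whole-object update model rather than in-place partial edits.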



Implementation

Object-based storage devices

Object storage at the protocol and device layers was proposed 20 years ago and approved for the SCSI command set nearly 10 years ago as "Object-based Storage Device Commands" (OSD), but it was not put into production until the development of the Seagate Kinetic Open Storage platform. The SCSI command set for object storage devices was developed by a working group of the Storage Networking Industry Association (SNIA) for the T10 committee of the InterNational Committee for Information Technology Standards (INCITS). T10 is responsible for all SCSI standards.

Object-based file systems

Some distributed file systems use an object-based architecture, where file metadata is stored on metadata servers and file data is stored on object storage servers. File system client software interacts with the distinct servers and abstracts them to present a complete file system to users and applications. IBM Spectrum Scale (also known as GPFS), Dell EMC Elastic Cloud Storage, Ceph, XtreemFS, and Lustre are examples of this type of object storage.

Archive storage

Some early incarnations of object storage were used for archiving, as implementations were optimized for data services such as immutability, not performance. EMC Centera and Hitachi HCP (formerly known as HCAP) are two commonly cited object storage products for archiving. Another example is the Quantum Lattus Object Storage Platform.

Cloud storage

The vast majority of cloud storage available in the market uses an object storage architecture. Some notable examples are Amazon Web Services S3, which debuted in March 2006; Rackspace Files, whose code was donated in 2010 to the OpenStack project and released as OpenStack Swift; and Google Cloud Storage, released in May 2010.

"Captive" object storage

Some large Internet companies developed their own software when object storage products were not commercially available or their use cases were very specific. Facebook famously created its own object storage software, code-named Haystack, to address its particular massive-scale photo management needs efficiently.

Hybrid storage

Some object storage systems, such as Ceph, GlusterFS, Cloudian, IBM Spectrum Scale, and Scality, support Unified File and Object (UFO) storage, allowing some clients to store objects on a storage system while other clients simultaneously store files on the same system. Although "hybrid storage" is not a widely accepted term for this concept, interoperable interfaces to the same set of data are becoming available in some object storage products.

Virtual object storage

In addition to object storage systems that own the files they manage, some systems provide an object abstraction on top of one or more traditional file-system-based solutions. These solutions do not own the underlying raw storage; instead, they actively mirror file system changes and replicate them in their own object catalog, alongside any metadata that can be automatically extracted from the files. Users can then contribute additional metadata through the virtual object storage APIs. A global namespace and replication capabilities, both within and across file systems, are typically supported.

Notable examples in this category are Nirvana and its open-source cousin iRODS.

Most products in this category have recently extended their capabilities to support other object store solutions as well.
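
The mirroring approach described in this section can be sketched with the Python standard library alone. The function below walks an existing directory tree and builds an object catalog without copying any file data; the catalog layout, the "user" field for contributed metadata, and the sample file names are assumptions made for illustration.

```python
import os
import tempfile


def mirror_directory(root):
    """Build an object catalog from an existing directory tree without copying data."""
    catalog = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            info = os.stat(path)
            key = os.path.relpath(path, root).replace(os.sep, "/")
            catalog[key] = {
                "path": path,                           # data stays in the file system
                "size": info.st_size,                   # automatically extracted metadata
                "extension": os.path.splitext(name)[1],
                "user": {},                             # filled in later through the API
            }
    return catalog


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "scans"))
    with open(os.path.join(root, "scans", "page1.tif"), "wb") as handle:
        handle.write(b"\x00" * 10)
    catalog = mirror_directory(root)
    catalog["scans/page1.tif"]["user"]["patient"] = "anonymous-17"
```

A production system would also watch for file system changes and update the catalog incrementally; this sketch only takes a one-time snapshot of the tree.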

Object storage systems

More general-purpose object storage systems came to market around 2008. Lured by the incredible growth of "captive" storage systems within web applications such as Yahoo Mail and by the early success of cloud storage, object storage systems promised the scale and capabilities of cloud storage, with the ability to deploy the system within an enterprise or at an aspiring cloud storage service provider. Notable examples of object storage systems include NetApp StorageGRID, EMC Atmos, OpenStack Swift, Scality RING, Caringo Swarm (formerly CAStor), Cloudian, OpenIO, and Minio.



Market adoption

One of the first object storage products, Lustre, is used in 70% of the Top 100 supercomputers and about 50% of the Top 500. As of June 16, 2013, this included 7 of the top 10, including the fastest system on the list at the time, China's Tianhe-2, and the second fastest, the Titan supercomputer at Oak Ridge National Laboratory.

Object storage systems saw good adoption in the early 2000s as an archive platform, particularly in the wake of compliance laws such as Sarbanes-Oxley. After five years on the market, EMC's Centera product claimed over 3,500 customers and 150 petabytes shipped by 2007. Hitachi's HCP product also claims many petabyte-scale customers. Newer object storage systems have also gained some traction, particularly around very large custom applications such as eBay's auction site, where EMC Atmos is used to manage over 500 million objects a day. As of March 3, 2014, EMC claimed to have sold over 1.5 exabytes of Atmos storage. On July 1, 2014, Los Alamos National Lab chose the Scality RING as the basis for a 500-petabyte storage environment, which would be among the largest ever.

"Captive" object storage systems such as Facebook's Haystack have scaled impressively. In April 2009, Haystack was managing 60 billion photos and 1.5 petabytes of storage, adding 220 million photos and 25 terabytes a week. Facebook later stated that it was adding 350 million photos a day and storing 240 billion photos. This could equal as much as 357 petabytes.

Cloud storage has become pervasive as many new web and mobile applications choose it as a common way to store binary data. As the storage backend for many popular applications such as SmugMug and Dropbox, AWS S3 has grown to massive scale, citing over 2 trillion objects stored in April 2013. Two months later, Microsoft claimed that it stored even more objects in Azure, at 8.5 trillion. By April 2014, Azure claimed over 20 trillion objects stored. Windows Azure Storage manages Blobs (user files), Tables (structured storage), and Queues (message delivery) and counts them all as objects.



Market analysis

IDC has begun to assess the object-based storage market annually using its MarketScape methodology. IDC describes the MarketScape as: "... a quantitative and qualitative assessment of the characteristics that assess a vendor's current and future success in the said market or market segment and provide a measure of their ascendancy to become a Leader or maintain a leadership. IDC MarketScape assessments are particularly helpful in emerging markets that are often fragmented, have several players, and lack clear leaders."

In 2013, IDC ranked Cleversafe, Scality, DataDirect Networks, Amplidata, and EMC as leaders. In 2014, it ranked Scality, Cleversafe, DataDirect Networks, Hitachi Data Systems, Amplidata, EMC, and Cloudian as leaders.



Standards

Object-based storage device standards

OSD version 1

In the first version of the OSD standard, objects are specified with a 64-bit partition ID and a 64-bit object ID. Partitions are created and deleted within an OSD, and objects are created and deleted within partitions. There are no fixed sizes associated with partitions or objects; they are allowed to grow subject to the physical size limitations of the device or logical quota constraints on a partition.

An extensible set of attributes describes objects. Some attributes are implemented directly by the OSD, such as the number of bytes in an object and the modification time of an object. There is a special policy tag attribute that is part of the security mechanism. Other attributes are not interpreted by the OSD. These are set on objects by the higher-level storage systems that use the OSD for persistent storage. For example, attributes might be used to classify objects, or to capture relationships among different objects stored on different OSDs.

A list command returns a list of identifiers for objects within a partition, optionally filtered by matches against their attribute values. A list command can also return selected attributes of the listed objects.

Read and write commands can be combined, or piggy-backed, with commands to get and set attributes. This ability reduces the number of times a higher-level storage system has to cross the interface to the OSD, which can improve overall efficiency.
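
The OSD version 1 concepts above (64-bit partition and object IDs, device-maintained versus opaque attributes, filtered listing, and piggy-backed attribute access) can be modeled in a few lines. This is a conceptual sketch only: the class and method names, and the attribute names "length" and "class", are assumptions and do not correspond to the actual SCSI command encoding.

```python
MAX_ID = 2**64 - 1  # both identifiers are 64-bit values


class OsdV1Sketch:
    def __init__(self):
        self.partitions = {}  # partition ID -> {object ID -> {"data", "attrs"}}

    def create_partition(self, partition_id):
        if not 0 <= partition_id <= MAX_ID:
            raise ValueError("partition ID must fit in 64 bits")
        self.partitions[partition_id] = {}

    def write(self, partition_id, object_id, data, set_attrs=None, get_attrs=()):
        """Write data and, in the same call, set and fetch attributes (piggy-backing)."""
        if not 0 <= object_id <= MAX_ID:
            raise ValueError("object ID must fit in 64 bits")
        objects = self.partitions[partition_id]
        entry = objects.setdefault(object_id, {"data": b"", "attrs": {}})
        entry["data"] = data
        entry["attrs"]["length"] = len(data)     # maintained by the device itself
        entry["attrs"].update(set_attrs or {})   # opaque to the device
        return {name: entry["attrs"].get(name) for name in get_attrs}

    def list_objects(self, partition_id, match=None, want=()):
        """Return object IDs, optionally filtered by attribute value, plus chosen attributes."""
        objects = self.partitions[partition_id]
        result = []
        for object_id in sorted(objects):
            attrs = objects[object_id]["attrs"]
            if match and any(attrs.get(k) != v for k, v in match.items()):
                continue
            result.append((object_id, {name: attrs.get(name) for name in want}))
        return result


osd_v1 = OsdV1Sketch()
osd_v1.create_partition(1)
returned = osd_v1.write(1, 10, b"abc", set_attrs={"class": "log"}, get_attrs=("length",))
osd_v1.write(1, 11, b"defgh", set_attrs={"class": "image"})
logs = osd_v1.list_objects(1, match={"class": "log"}, want=("length",))
```

The single write call both stores the data and returns the requested attribute, so the caller crosses the interface once instead of three times.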

OSD version 2

The second generation of the SCSI command set, "Object-Based Storage Devices - 2" (OSD-2), added support for snapshots, collections of objects, and improved error handling.

A snapshot is a point-in-time copy of all the objects in a partition into a new partition. The OSD can implement a space-efficient copy using copy-on-write techniques, so that the two partitions share the objects that are unchanged between the snapshots, or the OSD might physically copy the data to the new partition. The standard defines clones, which are writable, and snapshots, which are read-only.
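
A space-efficient snapshot can be approximated in a sketch by sharing immutable payloads between partitions and copying only the table of references. The class name CowPartition and its methods are hypothetical, and sharing at whole-object granularity is an assumption; a real device may share at a finer level.

```python
class CowPartition:
    """Partition whose object table maps IDs to shared, immutable byte strings."""

    def __init__(self, table=None, read_only=False):
        self._table = dict(table or {})  # copies references only, not the payloads
        self.read_only = read_only

    def write(self, object_id, data):
        if self.read_only:
            raise PermissionError("snapshots are read-only")
        self._table[object_id] = data    # rebinding leaves other partitions untouched

    def read(self, object_id):
        return self._table[object_id]

    def snapshot(self):
        return CowPartition(self._table, read_only=True)

    def clone(self):
        return CowPartition(self._table, read_only=False)


live = CowPartition()
live.write(1, b"v1")
live.write(2, b"unchanged")
snap = live.snapshot()
live.write(1, b"v2")
copy = live.clone()
copy.write(2, b"edited in clone")
```

After these calls, the snapshot still sees the old version of object 1, while object 2 is stored once and shared between the live partition and the snapshot.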

A collection is a special kind of object that contains the identifiers of other objects. There are operations to add to and remove from collections, and there are operations to get or set attributes for all the objects in a collection. Collections are also used for error reporting. If an object becomes corrupted by the occurrence of a media defect (i.e., a bad spot on the disk) or by a software error within the OSD implementation, its identifier is put into a special error collection. The higher-level storage system that uses the OSD can query this collection and take corrective action as necessary.
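
The role of collections, including the error collection, can be sketched as follows. The names are hypothetical, and the corruption check (a simple byte sum modulo 256) is a stand-in chosen for brevity; the text does not say how a device detects damage.

```python
class CollectionSketch:
    """A collection holds identifiers of other objects, not their data."""

    def __init__(self):
        self.members = set()

    def add(self, object_id):
        self.members.add(object_id)

    def remove(self, object_id):
        self.members.discard(object_id)


def set_attr_for_all(collection, attrs_by_object, name, value):
    """Apply one attribute update to every object named in the collection."""
    for object_id in collection.members:
        attrs_by_object.setdefault(object_id, {})[name] = value


def scrub(objects, checksums, error_collection):
    """Record the ID of each object whose toy checksum no longer matches."""
    for object_id, data in objects.items():
        if sum(data) % 256 != checksums[object_id]:
            error_collection.add(object_id)


objects = {1: b"good", 2: b"bad!"}
checksums = {1: sum(b"good") % 256, 2: 0}   # object 2's stored checksum is wrong
errors = CollectionSketch()
scrub(objects, checksums, errors)

attrs = {}
set_attr_for_all(errors, attrs, "needs_repair", True)
```

A higher-level system would then read errors.members and restore the listed objects from a replica or other redundancy.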



Differences between key-value and object stores

Unfortunately, the boundary between an object store and a key-value store is blurred, with key-value stores sometimes loosely referred to as object stores.

A traditional block storage interface uses a series of fixed-size blocks numbered starting at 0. Data must be exactly that fixed size and is stored in a particular block identified by its logical block number (LBN). Later, one can retrieve that block of data by specifying its unique LBN.
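
For comparison with the interfaces discussed next, a block interface can be sketched as a list of fixed-size slots addressed by LBN. The class name and the 512-byte default block size are assumptions for illustration.

```python
class BlockDevice:
    def __init__(self, block_count, block_size=512):
        self.block_size = block_size
        # Blocks are numbered 0 .. block_count - 1 and start out zero-filled.
        self._blocks = [bytes(block_size) for _ in range(block_count)]

    def write_block(self, lbn, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[lbn] = bytes(data)

    def read_block(self, lbn):
        return self._blocks[lbn]


disk = BlockDevice(block_count=4, block_size=8)
disk.write_block(2, b"ABCDEFGH")
```

Writing anything other than exactly 8 bytes to this device raises ValueError, which is the rigidity that key-value and object interfaces remove.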

With a key-value store, data is identified by a key rather than an LBN. A key might be "cat" or "olive" or "42". It can be an arbitrary sequence of bytes of arbitrary length. Data (called a value in this parlance) does not need to be a fixed size and can also be an arbitrary sequence of bytes of arbitrary length. One stores data by presenting the key and data (value) to the data store and can later retrieve the data by presenting the key. This concept is seen in programming languages: Python calls them dictionaries, Perl calls them hashes, Java and C++ call them maps, etc. Several data stores also implement key-value stores, such as Memcached, Redis and CouchDB.

Object stores are similar to key-value stores in two respects. First, the object identifier or URL (the equivalent of the key) can be an arbitrary string. Second, data may be of an arbitrary size.

There are, however, a few key differences between key-value stores and object stores. First, object stores also allow one to associate a limited set of attributes (metadata) with each piece of data. The combination of a key, a value, and a set of attributes is referred to as an object. Second, object stores are optimized for large amounts of data (hundreds of megabytes or even gigabytes), whereas for key-value stores the value is expected to be relatively small (kilobytes). Finally, object stores usually offer weaker consistency guarantees, such as eventual consistency, whereas key-value stores offer strong consistency.
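
The structural difference can be shown side by side in a short sketch. The names ObjectRecord and put_object and the sample attributes are hypothetical, and the sketch covers only the data model; it does not model value sizes or consistency behavior.

```python
from dataclasses import dataclass, field

# Key-value store: a key maps to an opaque value and nothing else.
kv_store = {}
kv_store["cat"] = b"small value"


@dataclass
class ObjectRecord:
    key: str
    value: bytes
    attributes: dict = field(default_factory=dict)  # metadata travels with the data


object_store = {}


def put_object(key, value, **attributes):
    object_store[key] = ObjectRecord(key, value, attributes)


put_object("videos/intro.mp4", b"\x00" * 1024, content_type="video/mp4", owner="alice")
record = object_store["videos/intro.mp4"]
```

Both stores are looked up by an arbitrary string, but only the object record can answer questions such as "who owns this?" without a separate index.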



See also

  • Cloud storage
  • Clustered file system
  • Object access method





External links

  • AWS S3 API Documentation
  • Google Cloud Storage API Documentation
  • IBM Cloud Object Storage
  • OpenStack Swift API Documentation
  • Seagate Kinetic Open Storage Documentation
  • Windows Azure Storage API Documentation
  • Minio Cloud Storage Documentation

Source of the article : Wikipedia
