Storage on AWS
Storage Types
- AWS storage services are grouped into three categories: block storage, file storage, and object storage. File storage is the one most people are already familiar with; it is what you see when you browse files and folders in Finder on macOS.
- Block Storage:
Block storage is a type of data storage typically used in storage-area network (SAN) environments where data is stored in volumes, referred to as blocks. Each block acts as an individual hard drive, and the blocks are controlled by a server operating system. This block-level storage is flexible and fully supports transactions that involve frequent read and write operations.
For example, the operating system, software, and system files of your application would fit well with block storage because it’s frequently accessed and updated. It allows you to modify parts of the file directly at the block level, so you don’t need to retrieve the whole file to make changes. This makes it particularly efficient for handling complex, transaction-driven processes such as running databases or powering virtual machines.
In AWS, Amazon Elastic Block Store (EBS) provides block-level storage volumes that you can attach to EC2 instances. It’s suitable for workloads where data is accessed by a single server and where consistent performance and low latency are vital.
Brief Summary:
Block storage is like a customizable closet with various drawers and compartments, where you can easily access and modify individual items without disturbing the rest, making it ideal for frequently updated and transaction-heavy data.
- Object Storage:
Object storage treats each file as a complete object stored in a flat address space rather than a file hierarchy. Each object includes the data itself, metadata, and a unique identifier. That unique identifier allows a server or end user to retrieve the object without needing to know the physical location of the data.
This is a better fit for storing static files or assets that are not frequently changed, like images, videos, or long-term backup files. For instance, in the context of your employee directory app, object storage would be a good choice for storing employee photos which will be retrieved often but modified rarely.
In AWS, Amazon Simple Storage Service (S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. It’s best suited for storing and retrieving large data files, and can be publicly accessible, making it a good fit for multimedia, backups, archives, data analytics, and more.
Brief Summary:
Object storage is like a storage unit full of sealed boxes: you can grab any box directly by its label, but to change what’s inside you have to replace the whole box, making it best for storing large, less frequently changed data like photos or videos.
- Summary:
To summarize, block storage offers higher performance and is more flexible but typically comes at a higher cost and complexity than object storage. In contrast, object storage is simpler, more scalable, and more cost-efficient for storing large amounts of unstructured data but does not support in-place updates like block storage does. As with most architectural decisions, the best choice depends on the specific requirements of the application or use case.
EC2 Instance Storage and Elastic Block Store
- Instance Store:
Instance Store provides temporary block-level storage for Amazon EC2 instances. This storage is located on disks that are physically attached to the host computer and is ephemeral, meaning the data does not persist beyond the lifespan of the instance. Because of its direct attachment, the storage response time can be fast, making it suitable for temporary storage of information that changes frequently, like buffers, caches, scratch data, and other temporary content, or for data that is replicated across a fleet (group) of instances, such as a load-balanced pool of web servers.
However, a significant downside of Instance Store is that if an instance stops or is terminated, all data in the Instance Store is lost. This impermanence makes Instance Store unsuitable for long-term, persistent storage.
- Amazon Elastic Block Store (Amazon EBS):
Amazon EBS provides persistent, block-level storage volumes for use with EC2 instances. These volumes act like an external drive that can be attached to or detached from different instances within the same availability zone. An EBS volume is off-instance storage that persists independently from the life of an instance, making it ideal for data that must be quickly accessible and requires long-term persistence.
You can use EBS volumes as primary storage for data that requires frequent updates, such as the system drive for an instance or storage for a database application. You can also use them for throughput-intensive and transaction-intensive workloads such as data warehousing and e-commerce applications.
Multiple EBS volumes can be attached to a single instance, but ordinarily, an EBS volume can only be attached to one instance at a time. That said, a feature called EBS Multi-Attach allows a single EBS volume to be concurrently attached to multiple instances within the same availability zone.
- EBS Multi-Attach:
EBS Multi-Attach is a feature that allows an EBS volume (Provisioned IOPS io1 or io2 volumes only) to be attached to multiple EC2 instances within the same availability zone. Each attached instance has full read and write permissions to the shared volume. This feature is particularly useful for clustering scenarios and read-heavy workloads that benefit from concurrent read access to a dataset.
- SSD-backed volumes:
SSD stands for Solid State Drive. Unlike Hard Disk Drives (HDD), SSDs have no moving parts. Instead, they use flash memory to store data. This makes SSDs faster and less prone to physical damage, leading to better reliability and performance. Amazon offers two types of SSD-backed EBS volumes:
- General Purpose SSD (gp2 and gp3): They offer cost-effective storage for a broad range of workloads, including personal productivity applications, medium-sized databases, and development and test environments.
- Provisioned IOPS SSD (io1 and io2): They are designed for I/O-intensive applications such as large relational or NoSQL databases that need to perform fast read and write operations.
- HDD-backed volumes:
HDD stands for Hard Disk Drive. They use spinning disks to read and write data, which makes them slower than SSDs but also less expensive. Amazon offers two types of HDD-backed EBS volumes:
- Throughput Optimized HDD (st1): They are ideal for workloads that require high throughput, such as big data, data warehousing, and log processing.
- Cold HDD (sc1): They provide inexpensive storage for infrequently accessed, throughput-oriented workloads, such as colder data (data that is infrequently accessed) that needs to be archived.
- EBS Snapshots:
An EBS Snapshot is a point-in-time copy of your data. It serves as a way to back up your EBS volumes. Snapshots are stored in Amazon S3, which is highly durable, and these backups can be used for disaster recovery, migration, and various other purposes. EBS snapshots are incremental, which means only the blocks of the volume that have changed after your most recent snapshot are saved in the new snapshot. This approach minimizes the storage space used by snapshots and reduces the time required to create a snapshot. If you need to restore your data, you can create a new EBS volume from a snapshot and attach it to an EC2 instance.
Easier explanation:
An EBS Snapshot is like a photo taken at a specific moment in time. It’s a picture of all the information you have stored in your EBS volume, like the files on your computer.
Just like you store your photos in a photo album, Amazon stores these snapshots in a service called Amazon S3. This service is very sturdy, meaning it keeps your snapshots safe, just like a sturdy photo album protects your photos.
Now imagine you accidentally delete some important files, or your computer crashes. You’d want to get your files back, right? In real life, if you lost a physical item, you’d wish you had a photo to remember it by. With Amazon’s snapshots, it’s even better – you can use the ‘photo’ (the snapshot) to recreate your lost files exactly as they were at the moment the snapshot was taken.
But what if you have tons of files and you take snapshots every day? Wouldn’t that take up a lot of space? Well, Amazon thought about that too. When you take a new snapshot, Amazon only saves the changes since your last snapshot. It’s like if you rearranged your room and took a new photo – you wouldn’t need to keep the old photo of the entire room, just the parts that changed.
And finally, when you need to get your files back from a snapshot, you can make a new EBS volume (like getting a new computer), and put all the data from the snapshot onto it. Then, you can attach this new volume to your EC2 instance (think of it as your online computer) and get back to work as if nothing had happened.
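To make this workflow concrete, here is a minimal sketch using boto3 (the AWS SDK for Python) that creates a gp3 EBS volume, attaches it to an instance, takes an incremental snapshot, and restores a new volume from that snapshot. The region, Availability Zone, instance ID, and device name are placeholders, not values from the course material.
```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# Create a 100 GiB General Purpose SSD (gp3) volume in one Availability Zone.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,
    VolumeType="gp3",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# Attach the volume to an EC2 instance in the same Availability Zone.
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",   # placeholder instance ID
    Device="/dev/sdf",
)

# Take a point-in-time, incremental snapshot of the volume (stored in S3).
snapshot = ec2.create_snapshot(
    VolumeId=volume["VolumeId"],
    Description="Nightly backup",
)
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot["SnapshotId"]])

# Restore: create a new volume from the snapshot; it can then be attached
# to any instance in that Availability Zone.
restored = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    SnapshotId=snapshot["SnapshotId"],
    VolumeType="gp3",
)
```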
Object Storage with Amazon S3
- Downside of Using EBS: Amazon Elastic Block Store (EBS) is a block storage system used to store persistent data. It’s like a hard drive attached to your computer but in a cloud environment. It’s commonly used in conjunction with Amazon EC2 instances. However, there are some limitations:
- Limited Access: EBS volumes can only be attached to one EC2 instance at a time, making it difficult to share data across multiple instances. Some EBS volumes support multi-attach, but it’s not available for all volume and instance types.
- Size Limitations: EBS volumes have a maximum size (16 TiB for most volume types). If your data exceeds this limit, you need to manage additional volumes, which makes EBS unsuitable for storing large, ever-growing collections of data such as high-definition photos and videos.
- Advantage of Using S3: Amazon Simple Storage Service (S3) is an object storage service that offers scalability, data availability, security, and performance. It is designed to be a standalone storage solution not tied to compute resources.
- Scalability: S3 allows you to store as many objects as you want, with an individual object size limit of 5 terabytes. This makes it ideal for storing large files or a large number of files.
- Accessibility: Unlike EBS volumes, you can access your data in S3 from anywhere on the web using URLs, hence it is sometimes called “storage for the internet”.
- S3 is Distributed Storage: S3 stores your data across multiple facilities (Availability Zones) within an AWS Region. This redundancy ensures high availability and durability. In fact, S3 is designed for 99.99% availability and offers 11 9’s (99.999999999%) of durability.
- Bucket: In S3, you store your objects in containers called “buckets”. Before you can upload any object to S3, you need to create a bucket; you can then place your objects inside it and, if you want to organize them, create folder-like prefixes within the bucket. A bucket lives in the single AWS Region you choose when you create it, but its name must be globally unique across all AWS accounts and DNS compliant. A minimal sketch of creating a bucket and uploading an object follows.
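The sketch below uses boto3. The bucket name, region, local file path, and object key are made-up placeholders; because bucket names are globally unique, this exact name will likely already be taken.
```python
import boto3

s3 = boto3.client("s3", region_name="us-west-2")  # assumed region

# Bucket names must be globally unique and DNS compliant; this one is made up.
bucket = "employee-directory-photos-example-123"

# Outside us-east-1 you must pass a LocationConstraint matching the region.
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "us-west-2"},
)

# Upload an object; the key can contain "/" to mimic folders, but S3 itself
# stores everything in a flat namespace.
s3.upload_file(
    Filename="photos/employee_photo.jpg",        # local file (placeholder)
    Bucket=bucket,
    Key="employees/employee_photo.jpg",
    ExtraArgs={"StorageClass": "STANDARD_IA"},   # optional storage class
)

# Retrieve the object later by bucket + key.
obj = s3.get_object(Bucket=bucket, Key="employees/employee_photo.jpg")
print(obj["ContentLength"], "bytes")
```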
- S3 Bucket Policies: S3 bucket policies are documents that define permissions about who can access the bucket and what actions they can perform. These policies are attached to buckets and can apply to every object in that bucket. For example, you can create a bucket policy that allows read-only permissions to anonymous viewers; a sketch of such a policy appears after the comparison below.
- Difference between IAM Policies and S3 Bucket Policies: Both IAM policies and S3 bucket policies use the same JSON-based policy language and can be used to manage access to S3 resources. However, there are differences in their scope and use.
- IAM Policies: These are attached to users, groups, and roles. They define what actions are allowed or denied on the resources for these users, groups, and roles.
- S3 Bucket Policies: These are attached directly to the S3 buckets. They define what actions are allowed or denied for the bucket and its objects. While the policy is attached at the bucket level and applies to the entire bucket, the rules within the policy can be specific to individual objects or sets of objects within the bucket. For example, a policy could include a rule that allows public access to a certain object, while denying public access to all other objects. However, the policy itself cannot be attached directly to individual objects or folders within the bucket.
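Here is a minimal sketch of the read-only-for-anonymous-viewers example mentioned above, expressed as a bucket policy applied with boto3. The bucket name is a placeholder, and in practice the bucket’s Block Public Access settings must also allow public policies for this to take effect.
```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "employee-directory-photos-example-123"  # placeholder bucket name

# Allow anonymous, read-only access (GetObject) to every object in the bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }
    ],
}

# The policy is attached at the bucket level, not to individual objects.
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))
```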
- Encrypting Data in S3: Server-side encryption (SSE) and client-side encryption are two major encryption methods used for data security and privacy, particularly in the context of cloud storage services like Amazon S3. The difference between the two approaches primarily lies in where the encryption and decryption processes occur and who is responsible for managing the encryption keys.
- Server-Side Encryption (SSE): With server-side encryption, your data (in this case, your object) is encrypted after it reaches Amazon S3 and before it is stored on the disks in Amazon’s data centers. Amazon S3 handles the encryption process for you; you don’t need to worry about managing the encryption keys or the encryption process itself. When you retrieve your data, Amazon S3 decrypts it for you.
There are several server-side encryption options available in Amazon S3:
- SSE-S3: This provides an integrated solution where Amazon handles key management and key protection for you using keys that are unique to each object.
- SSE-KMS: This provides additional auditing and key usage controls via the AWS Key Management Service (AWS KMS).
- SSE-C: With this option, you provide the encryption key as part of your request, and Amazon S3 manages the encryption (as it writes to disks) and decryption (when you access your object).
- Client-Side Encryption: In contrast to server-side encryption, client-side encryption involves encrypting your data on your client side (for example, on your local machine) before sending it over to Amazon S3 for storage. This means that you’re responsible for managing the encryption process, the encryption keys, and all the related encryption tools. In essence, you’re in total control of the encryption of your data. When you retrieve your data from S3, it will still be in its encrypted form, and you’ll need to decrypt it on your end.
- Encryption in transit: To encrypt data in transit (as it moves to and from Amazon S3), you can use SSL/TLS, which secures the connection between your client and the S3 servers and ensures that your data can’t be read or tampered with during transit. You can also use client-side encryption for this purpose, since your data would already be encrypted before it’s sent over the network.
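The sketch below shows how the server-side encryption options above translate into S3 API parameters, assuming boto3. The bucket, keys, object bodies, and KMS key alias are placeholders.
```python
import boto3

s3 = boto3.client("s3")
bucket = "employee-directory-photos-example-123"  # placeholder

# SSE-S3: S3 encrypts the object at rest with keys it manages (AES-256).
s3.put_object(
    Bucket=bucket,
    Key="reports/q1.csv",
    Body=b"placeholder contents",
    ServerSideEncryption="AES256",
)

# SSE-KMS: encrypt with a key managed in AWS KMS, giving auditing and key
# usage controls. The key alias here is a made-up example.
s3.put_object(
    Bucket=bucket,
    Key="reports/q2.csv",
    Body=b"placeholder contents",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-app-key",
)
```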
- Amazon S3 Versioning: This is a powerful feature that allows you to maintain multiple versions of an object (which includes all writes and even deletes) in the same bucket. This can be invaluable for data recovery as you can restore previous versions of files in case of accidental deletion or overwriting, or even application failures.
Versioning works by assigning a unique version ID to each object uploaded to a bucket. If you upload a new object with the same key (name), instead of overwriting the existing object, S3 treats it as a separate version of the object. For example, you could have two objects with the same key, like employeephoto.gif, but with different version IDs (e.g., employeephoto.gif (version 111111) and employeephoto.gif (version 121212)).
Furthermore, deleting an object in a versioned bucket doesn’t permanently remove it. Amazon S3 puts a marker on the object indicating it was deleted, but the object still exists in the bucket. If you need to restore the object, you can remove the delete marker, and the object is restored. (It’s possible to permanently delete a specific version of an object if you specify the version ID in the DELETE request.) Buckets can be in one of three versioning states:
- Unversioned (the default): In this state, no new or existing objects in the bucket have a version. If an object is uploaded with the same key as an existing object, it overwrites the existing object.
- Versioning-enabled: This state enables versioning for all objects in the bucket. Anytime an object is uploaded with the same key as an existing object, it’s treated as a new version rather than replacing the existing object. All versions of an object, including all writes and deletes, are kept.
- Versioning-suspended: This state suspends versioning for any new objects added to the bucket. Any new objects added to the bucket will not have a version, but all existing objects in the bucket will retain their versions. This allows you to stop accruing new versions for objects but does not affect existing versions.
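A short boto3 sketch of the versioning behavior described above: enabling versioning, overwriting a key, adding a delete marker, and listing the versions that remain. The bucket name and object key are placeholders.
```python
import boto3

s3 = boto3.client("s3")
bucket = "employee-directory-photos-example-123"  # placeholder

# Enable versioning on the bucket (state becomes versioning-enabled).
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# Uploading the same key twice now keeps both versions instead of overwriting.
s3.put_object(Bucket=bucket, Key="employeephoto.gif", Body=b"version one")
s3.put_object(Bucket=bucket, Key="employeephoto.gif", Body=b"version two")

# A plain DELETE only adds a delete marker; older versions remain.
s3.delete_object(Bucket=bucket, Key="employeephoto.gif")

# List every surviving version of the object.
versions = s3.list_object_versions(Bucket=bucket, Prefix="employeephoto.gif")
for v in versions.get("Versions", []):
    print(v["VersionId"], v["IsLatest"])

# To permanently delete one specific version, pass its VersionId:
# s3.delete_object(Bucket=bucket, Key="employeephoto.gif", VersionId="...")
```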
- Amazon S3 Storage Classes: Amazon S3 Storage Classes are various types of data storage options provided by Amazon S3 to meet differing needs in terms of accessibility, pricing, and speed. When you upload a file or “object” to Amazon S3, you can choose a storage class that optimally meets your specific needs.
The available classes vary widely, from frequently accessed data (Amazon S3 Standard) to data that is rarely accessed and can tolerate retrieval latency (Amazon S3 Glacier and Glacier Deep Archive). There are also options for infrequently accessed data, with either high availability (Amazon S3 Standard-IA) or lower availability for cost savings (Amazon S3 One Zone-IA).
For data with unknown or changing access patterns, Amazon S3 Intelligent-Tiering automatically moves data between two access tiers (frequent and infrequent) based on usage patterns.
The Glacier storage classes are suited for long-term archiving where data is not expected to be accessed frequently. These options are much cheaper than the Standard and Infrequent Access classes, but they come with a longer retrieval time, ranging from a few minutes to several hours.
Lastly, Amazon S3 on Outposts delivers object storage capability to your on-premises AWS Outposts environment, helping you meet local data processing and data residency needs.
It’s crucial to note that pricing, availability, and durability can vary depending on the chosen storage class. For instance, while all classes provide high durability, the Standard class offers higher availability and faster access times compared to the Glacier classes. Similarly, the cost per gigabyte stored can be significantly lower for the Glacier classes compared to the Standard class, but the former might have additional retrieval costs.
- Amazon S3 Standard: This is the default storage class for S3, providing general purpose storage suitable for a variety of use cases. It is ideal for cloud applications, dynamic websites, content distribution, mobile and gaming applications, and big data analytics. This class offers high durability and availability and delivers low latency and high throughput performance.
- Amazon S3 Intelligent-Tiering: This class is designed for data with unknown or changing access patterns. It has two access tiers: frequent and infrequent. Amazon S3 automatically monitors your data and moves it between these tiers based on how often you access the data. This helps optimize costs without performance impact or operational overhead.
- Amazon S3 Standard-Infrequent Access (S3 Standard-IA): This storage class is for data that isn’t accessed often but still needs to be retrieved quickly when needed. S3 Standard-IA offers the same durability, throughput, and low latency as S3 Standard, but at a lower per-GB storage price. There is a retrieval fee, so it’s ideal for long-term backups, disaster recovery files, and similar use-cases.
- Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA): While most S3 classes store data in a minimum of three Availability Zones (AZs), S3 One Zone-IA stores data in a single AZ and costs 20% less than S3 Standard-IA. It’s a good option for storing secondary backup copies of on-premises data or data that can be easily recreated.
- Amazon S3 Glacier Instant Retrieval: This archive storage class is the lowest-cost option for long-lived data that is rarely accessed but requires rapid retrieval when it is. It’s suitable for archiving data that might need to be accessed quickly.
- Amazon S3 Glacier Flexible Retrieval: S3 Glacier Flexible Retrieval is designed for archive data that is accessed one to two times per year and retrieved asynchronously. It offers up to 10% lower cost than S3 Glacier Instant Retrieval.
- Amazon S3 Glacier Deep Archive: This is Amazon S3’s lowest-cost storage class, designed for long-term data retention and digital preservation. It’s ideal for industries with regulatory compliance requirements that necessitate retaining data for 7 to 10 years or longer.
- Amazon S3 Outposts: Amazon S3 Outposts brings S3 object storage capabilities to your on-premises AWS Outposts environment. It’s useful for applications with local data processing and data residency requirements.
Further explanation of Amazon S3 Outposts:
Amazon S3 on Outposts is a feature that allows you to use S3 storage capabilities in your own on-premises environments, such as your own data centers, for data residency, local processing, data migration, or other specific requirements. It brings the agility, scalability, and reliability of S3 to workloads that need to remain on-premises due to low latency or data residency requirements.
AWS Outposts is essentially a fully managed service that extends AWS infrastructure, services, APIs, and tools to virtually any customer datacenter, co-location space, or on-premises facility for a truly consistent hybrid experience.
So while you’re using your own facilities (like servers and data centers) for storage, AWS manages the infrastructure and provides the same APIs and tools as in the AWS cloud, enabling a seamless integration between your on-premises environments and the AWS cloud.
- Object Lifecycle Management: Object Lifecycle Management is a feature in Amazon S3 that enables automatic transitioning of objects between different storage classes or deletion of objects after a specified time period. This is particularly useful for managing costs and making sure that your data is stored in the most cost-effective manner possible, depending on its usage patterns.
A lifecycle policy consists of one or both of the following actions:
- Transition actions: These define when objects should move from one storage class to another. This is generally done to save on storage costs, especially for data that is not accessed frequently. For instance, you may initially store data in the S3 Standard storage class for immediate access. However, if the data isn’t accessed frequently after the first 30 days, you could set a transition action to automatically move this data to the S3 Standard-IA or One Zone-IA storage classes, which are less expensive but still offer quick access when needed. After a year, if the data is unlikely to be accessed at all, you could set another transition action to move the data to S3 Glacier or Glacier Deep Archive, which are designed for long-term archival storage at very low costs.
- Expiration actions: These actions define when objects should be permanently deleted from Amazon S3. If your data has a defined life cycle, or you no longer need to keep certain data after a certain period of time, you can automate its deletion. For example, you could set an expiration action to automatically delete log files older than a year, or temporary files that are only needed for a day.
These two actions can be combined in various ways to create a data management strategy that optimizes costs and access needs.
For example, a common strategy for log files is to store them in S3 Standard for immediate access for a few days or weeks (for troubleshooting and analysis), then transition them to a cheaper storage class for some months (for less frequent but still possible access), and then automatically delete them after a year or so when they are no longer useful.
Another common use case is for documents or data that change in access frequency. If certain documents are frequently accessed for a limited period of time and then rarely accessed, you could set up a lifecycle policy to move these documents to a cheaper storage class after the peak access period, then archive them after a longer period, and finally delete them when they are no longer needed or required to be kept by organizational or regulatory rules.
In all cases, the aim is to balance the cost of storage against the need for access, and automate the process to reduce management overhead.
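The log-file strategy above can be expressed as a single lifecycle configuration. Below is a minimal boto3 sketch; the bucket name, prefix, and day counts are illustrative assumptions rather than recommendations.
```python
import boto3

s3 = boto3.client("s3")
bucket = "application-logs-example-123"  # placeholder

# Move objects under logs/ to Standard-IA after 30 days, to Glacier after
# 365 days, and permanently delete them after 730 days.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "log-retention",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 730},
            }
        ]
    },
)
```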
File storage on AWS
- Amazon Elastic File System (EFS): This is a fully managed Network File System (NFS) that can be mounted on multiple Amazon EC2 instances, allowing them to access the same set of files concurrently. It’s designed to be easy to use and provides a simple interface that allows you to create and configure file systems swiftly. EFS is a scalable file storage solution for use with AWS Cloud services and on-premises resources. It scales on the fly without disrupting applications, growing and shrinking automatically as you add and remove files. It is suitable for a wide range of use cases, including content serving and sharing, enterprise applications, web serving, and more; a sketch of creating an EFS file system appears at the end of this section.
- Amazon FSx for Windows File Server: This is a fully managed native Microsoft Windows file system that can be accessed from up to thousands of compute instances using the Server Message Block (SMB) protocol. It’s built on Windows Server, delivering a wide range of administrative features such as user quotas, end-user file restores, and Microsoft Active Directory (AD) integration. FSx for Windows File Server is ideal for enterprise applications, home directories, software build environments, and any other workloads requiring file storage that is accessible over the SMB protocol.
- Amazon FSx for Lustre: This is a fully managed file system that is optimized for compute-intensive workloads, such as high-performance computing, machine learning, and media data processing workflows. Lustre is a popular open-source parallel file system that allows many clients to read and write to the same file concurrently. FSx for Lustre is integrated with Amazon S3, making it easier to process cloud data sets with supercomputing-class performance. After processing, the results can be written back to S3.
Some key similarities between these three services:
- They are all fully managed, meaning AWS handles the time-consuming administrative tasks for you, such as hardware and software maintenance, capacity planning, patching, backups, etc.
- They provide file storage capabilities, allowing multiple clients or instances to read and write to the same set of files concurrently.
- They operate on a pay-as-you-go model, meaning you only pay for the storage capacity you use, without having to provision storage in advance.
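As referenced in the EFS description above, here is a minimal boto3 sketch of creating an EFS file system and exposing it to a VPC subnet so EC2 instances there can mount it over NFS. The creation token, subnet ID, and security group ID are placeholders.
```python
import boto3

efs = boto3.client("efs", region_name="us-east-1")  # assumed region

# Create a General Purpose EFS file system; capacity grows and shrinks
# automatically as files are added and removed.
fs = efs.create_file_system(
    CreationToken="shared-app-files",        # idempotency token (made up)
    PerformanceMode="generalPurpose",
    Encrypted=True,
)

# Create a mount target in one subnet so instances there can reach it.
efs.create_mount_target(
    FileSystemId=fs["FileSystemId"],
    SubnetId="subnet-0123456789abcdef0",      # placeholder subnet ID
    SecurityGroups=["sg-0123456789abcdef0"],  # placeholder security group
)

# On each instance you would then mount it, for example with the
# amazon-efs-utils helper:  mount -t efs <FileSystemId>:/ /mnt/shared
```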
Questions
- What is meant by “Object storage does not use a traditional file hierarchy”?:
Traditional file storage systems use a hierarchical structure, much like the folders on your computer. You have folders, which contain files, and those folders can have subfolders, each with more files. This is a method of organizing and locating files based on their position in this hierarchy.
On the other hand, object storage doesn’t use this hierarchical file structure. Instead, each piece of data (or ‘object’) is stored individually with a unique identifier, somewhat similar to how each book in a library has a unique call number. This unique identifier allows the data to be retrieved directly, no matter where it’s actually located.
This lack of hierarchy means that object storage scales very well (you don’t have to worry about running out of folders or having a complex folder structure) and you can access any piece of data directly without navigating through a folder structure. This makes it excellent for storing large amounts of data, but it may be less familiar to users who are used to hierarchical file systems.
- Buffers:
In the context of computer systems, a buffer is a region of physical memory storage used to temporarily store data while it is being moved from one place to another. It’s a way of compensating for a difference in speed of data flow between the sending and receiving devices or processes. Buffers are typically used when there is a difference between the rate at which data is received and the rate at which it can be processed, or in the case where these rates are variable, for example, in a printer spooler or in streaming services.
Brief Summary:
A buffer provides a temporary holding place for the data while it is being processed or transferred. This allows the sending and receiving processes or devices to operate at different speeds or to handle other tasks simultaneously.
- Caches:
A cache is a hardware or software component that stores data so that future requests for that data can be served faster. The data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. When the cache is full, the algorithm must choose which items to discard to make room for the new ones.
Caching allows you to efficiently reuse previously retrieved or computed data. For example, web browsers use caching to store copies of web pages, images, and other content on a user’s hard drive, so the next time the user visits those sites, the browser can load them from the cache rather than the original server, improving speed and saving bandwidth.
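As a generic, non-AWS illustration of cache hits and misses, here is a short Python sketch that memoizes an expensive lookup with functools.lru_cache; the function name and IDs are made up.
```python
from functools import lru_cache

# Keep up to 128 results; repeated calls with the same argument are cache
# hits and return immediately without redoing the work.
@lru_cache(maxsize=128)
def load_profile(employee_id: str) -> dict:
    print(f"cache miss - fetching {employee_id} from the database")
    return {"id": employee_id}  # stand-in for an expensive lookup

load_profile("e-42")              # miss: does the expensive work
load_profile("e-42")              # hit: served straight from the cache
print(load_profile.cache_info())  # e.g. hits=1, misses=1, maxsize=128
```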
- Scratch Data:
Scratch data is a form of temporary data storage used during computations where large amounts of data need to be written and read but only for a limited period. Once the process is complete, scratch data can be discarded. Scratch space is often used in high-performance computing environments or in data analysis scenarios where large datasets are processed and intermediate results are temporarily stored. The term “scratch” refers to a workspace for temporary use.
Brief Summary:
Scratch space provides temporary storage for data or calculations that are part of a larger process.
- Load-balanced pool:
Imagine a supermarket with several checkout lanes. These lanes are like servers in a computer network. Each lane (or server) can only handle one customer (or user request) at a time.
Now, let’s say it’s a busy day, and lots of customers are shopping. Some lanes might get more customers than others, leading to longer lines and wait times in those lanes. This is not an efficient way to handle customers because while some are waiting for a long time, others might be idle.
This is where the supermarket manager (or load balancer) comes in. The manager sees all the lanes and directs customers to the lanes that are less busy or idle. By doing this, the manager ensures that all lanes are utilized evenly, reducing the overall wait time for customers.
In a computer network, a load balancer works similarly. It sees all the servers (or EC2 instances in an AWS context) and routes user requests to servers that are less busy, ensuring all servers share the load equally. This way, no single server becomes a bottleneck due to too many requests, and overall system performance improves.
That’s a load-balanced pool in a nutshell: it’s a system that efficiently distributes workloads across multiple servers to ensure optimal performance and minimize overload on any single server.
- Impermanence:
The state of not lasting or staying the same forever.
- System Drive for an Instance:
This refers to the primary storage device attached to a virtual or physical machine (an instance in the cloud computing context) that contains the necessary system files to boot and operate the system. It’s essentially the equivalent of the hard drive in your personal computer that has the operating system installed on it. When the machine starts up, it reads the operating system data from this system drive. In addition to the operating system, the system drive might also contain system-level applications, services, and related configuration files required for the system to function.
In the context of Amazon EC2 (Elastic Compute Cloud), when you launch an instance, it comes with a root volume (system drive), which contains the image used to boot the instance. Depending on the root device of the AMI (Amazon Machine Image), this root volume can either be an EBS (Elastic Block Store) volume or an instance store volume.
- The statement “This feature is particularly useful for clustering scenarios and read-heavy workloads that benefit from concurrent read access to a dataset” means:
When you have a situation where you are using multiple servers (a cluster) to handle your operations, like a busy website or an application that many users are accessing at the same time, EBS Multi-Attach is beneficial.
With EBS Multi-Attach, all the servers in your cluster can read from the same EBS volume at the same time. This means they all can get the information they need without waiting for their turn, which makes everything run faster and more efficiently.
Also, for tasks that require a lot of data reading operations (read-heavy workloads), this feature is especially beneficial. When many parts of your system need to look up information from your dataset at the same time, EBS Multi-Attach allows them all to do so concurrently, leading to smoother and quicker operations.
So, in essence, EBS Multi-Attach helps speed up read operations, especially when multiple systems or parts of a system need to access the same data simultaneously.
- Must the root volume be either an instance store volume or an EBS volume, while the EC2 instance itself can use multiple instance store and EBS volumes together?:
Yes, that’s correct. When you launch an Amazon EC2 instance, you must choose a root volume. This root volume can be either an Amazon Elastic Block Store (EBS) volume or an Instance Store volume, depending on the AMI (Amazon Machine Image) that you choose to launch your instance with.
The root volume contains the operating system that the EC2 instance runs, and is necessary for the instance to operate. If you use an EBS volume as the root volume, the data on the root volume will persist independently from the lifecycle of the instance itself. If you use an Instance Store volume as the root volume, the data on the root volume will cease to exist if the instance is stopped or terminated.
In addition to the root volume, you can attach multiple Amazon EBS volumes and use multiple instance store volumes (if your instance type supports them) for an EC2 instance. These additional volumes act as extra storage drives for your instance, similar to attaching an external hard drive to your laptop.
Remember that each type of storage has its own use cases, advantages, and drawbacks. Instance store volumes can provide high performance and low latency, but their data doesn’t persist if the instance is stopped or terminated. On the other hand, EBS volumes are persistent and versatile, but they might have higher latency compared to instance store volumes.
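To tie this together, here is a hedged boto3 sketch that launches an instance with an EBS root volume, an additional persistent EBS data volume, and an instance store mapping. The AMI ID, instance type, and device names are placeholders; whether instance store volumes are actually available (and how they appear) depends on the instance type you choose.
```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="m5d.large",          # a type that includes instance store
    MinCount=1,
    MaxCount=1,
    BlockDeviceMappings=[
        # Root volume: an EBS gp3 volume, deleted when the instance terminates.
        {
            "DeviceName": "/dev/xvda",
            "Ebs": {"VolumeSize": 30, "VolumeType": "gp3", "DeleteOnTermination": True},
        },
        # An extra EBS data volume that persists after termination.
        {
            "DeviceName": "/dev/sdf",
            "Ebs": {"VolumeSize": 200, "VolumeType": "gp3", "DeleteOnTermination": False},
        },
        # An ephemeral instance store mapping; data is lost on stop/terminate.
        # (On NVMe-based types such as m5d, instance store volumes are attached
        # automatically and this mapping is effectively ignored.)
        {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
    ],
)
```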