Amazon Neptune Interview Questions
1. What is Amazon Neptune?
Amazon Neptune is a fully managed graph database service offered by Amazon Web Services (AWS). It allows users to store and query graph data, which is data that is connected in a network or web of relationships. Neptune is designed to be highly scalable and performant, and it supports popular graph query languages such as Gremlin and SPARQL. It is often used in applications such as recommendation engines, fraud detection, and social network analysis.
2. What popular graph query languages does Amazon Neptune support?
Amazon Neptune supports popular graph query languages such as Gremlin and SPARQL. Gremlin is a graph traversal language designed to be used with graph databases, and it allows users to query and manipulate data stored in a graph. SPARQL is a query language for retrieving and modifying data stored in RDF (Resource Description Framework) formats.
3. Can I use Apache TinkerPop Gremlin and RDF/SPARQL on the same Neptune instance?
Yes, each Neptune instance provides both a Gremlin Websocket Server and a SPARQL 1.1 Protocol REST endpoint. The data is separated between the stacks, i.e. you cannot execute a Gremlin traversal over RDF data or vice-versa. This is intended to allow you to experiment and see which one is the best fit for your application. In production, as the resources are shared on a single instance, we recommend clients access an instance via only one of the languages, i.e. Gremlin or SPARQL.
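As a rough illustration of the two stacks living side by side, the sketch below builds both endpoint URLs for a single instance. The host name is a placeholder, and the port 8182 and the /gremlin and /sparql paths are assumptions based on commonly used Neptune defaults; check your own cluster's settings.

```python
def neptune_endpoints(host: str, port: int = 8182) -> dict:
    """Return the Gremlin and SPARQL endpoint URLs for one Neptune instance."""
    return {
        # Gremlin traversals are sent over a WebSocket connection.
        "gremlin": f"wss://{host}:{port}/gremlin",
        # SPARQL queries and updates are sent over HTTPS.
        "sparql": f"https://{host}:{port}/sparql",
    }


# Hypothetical host name, for illustration only.
endpoints = neptune_endpoints("neptune.example.internal")
```

Both URLs point at the same instance, but the data behind them is kept separate, so an application would normally use only one of the two.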
4. How can I migrate from an existing Apache TinkerPop Gremlin application to Amazon Neptune?
Amazon Neptune provides an Apache TinkerPop Gremlin Server that supports both Websocket and REST connections. Once you provision an instance of Amazon Neptune, you can configure your existing TinkerPop application to use the endpoint provided by the service. See also Accessing the Graph via Gremlin.
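For the REST option, a minimal sketch of a Gremlin request might look like the following. It assumes the endpoint accepts an HTTP POST with a JSON body containing a "gremlin" key; the host name and the helper name build_gremlin_request are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_gremlin_request(host: str, traversal: str, port: int = 8182) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST carrying a Gremlin traversal."""
    body = json.dumps({"gremlin": traversal}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}:{port}/gremlin",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host name; sending requires network access from inside the VPC.
request = build_gremlin_request("neptune.example.internal", "g.V().limit(1)")
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```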
5. Do I need to change client drivers to use Amazon Neptune’s Gremlin Server?
No, Amazon Neptune’s Gremlin server supports clients that are compatible with Apache TinkerPop version 3.3 using both Websocket and REST connections.
6. How can I migrate from a triple store with a SPARQL endpoint to Amazon Neptune?
Amazon Neptune provides an HTTP REST endpoint that implements the SPARQL 1.1 Protocol. Once you provision a service instance, you can configure your application to point to the SPARQL endpoint. See also Accessing the Graph via SPARQL.
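Pointing an application at the SPARQL endpoint can be sketched with the standard library alone. The example below follows the SPARQL 1.1 Protocol convention of sending the query as a URL-encoded "query" form parameter via POST; the endpoint URL and the helper name build_sparql_request are placeholders, and the request is built without being sent.

```python
import urllib.parse
import urllib.request


def build_sparql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a SPARQL 1.1 Protocol query via POST."""
    body = urllib.parse.urlencode({"query": query}).encode("ascii")
    return urllib.request.Request(
        url=endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Ask for SELECT results in the standard JSON results format.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


# Hypothetical endpoint URL, for illustration only.
sparql_request = build_sparql_request(
    "https://neptune.example.internal:8182/sparql",
    "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10",
)
```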
7. Do I need to change client drivers to use Amazon Neptune’s SPARQL Endpoint?
No, Amazon Neptune’s SPARQL endpoint will work with any client that supports the SPARQL 1.1 Protocol.
8. Is Neptune ACID (Atomicity, Consistency, Isolation, Durability) compliant?
Yes, Amazon Neptune is ACID compliant. ACID stands for Atomicity, Consistency, Isolation, and Durability, and it refers to a set of properties that ensure that database transactions are processed reliably.
9. Why are Amazon RDS permissions and resources required to use Amazon Neptune?
Amazon Neptune is a purpose-built, high-performance graph database engine. For certain management features such as instance lifecycle management, encryption-at-rest with AWS Key Management Service (KMS) keys, and security groups management, Neptune leverages operational technology that is shared with Amazon RDS.
10. Does Amazon Neptune have a service level agreement (SLA)?
Yes. The Amazon Neptune SLA commits to a stated level of availability, and if the service falls short of it, AWS issues service credits to the customer’s account. Review the SLA itself for the exact availability targets, credit terms, and exclusions.
Questions on AWS Neptune Performance
11. What types of graph query workloads are optimized to work with Amazon Neptune?
Amazon Neptune is designed to support graph applications that require high-throughput, low-latency graph queries. With support for up to 15 read replicas, Amazon Neptune can serve hundreds of thousands of queries per second.
12. Does Amazon Neptune perform query optimization?
Yes. Amazon Neptune performs query optimization for both Gremlin traversals and SPARQL queries. Query optimization is the process of choosing a more efficient execution plan for a query, for example by making use of the graph’s indexes, so that the query runs faster and consumes fewer resources.
13. Is Amazon Neptune built on a relational database?
No, Amazon Neptune is a purpose-built, high-performance graph database engine. Neptune efficiently stores and navigates graph data, and uses a scale-up, in-memory optimized architecture to allow for fast query evaluation over large graphs.
Questions on AWS Neptune Pricing
14. How much does Amazon Neptune cost?
See the Amazon Neptune pricing page for current pricing information:
https://www.amazonaws.cn/en/neptune/pricing/
15. In which AWS regions is Amazon Neptune available?
Amazon Neptune is available in the following AWS regions:
- US East (N. Virginia)
- US East (Ohio)
- US West (Oregon)
- EU (Ireland)
- Asia Pacific (Tokyo)
- Asia Pacific (Seoul)
- Asia Pacific (Mumbai)
It is important to note that the availability of Amazon Neptune and other AWS services can vary by region, and some regions may have more limited availability or functionality. It is always best to check the AWS region table for the latest information on service availability.
16. Amazon Neptune replicates each chunk of my database volume six ways across three Availability Zones. Does that mean that my effective storage price will be three or six times what is shown on the pricing page?
No. Replicating each chunk of your database volume six ways across three Availability Zones increases the availability and durability of your data, not its cost. Amazon Neptune storage is billed on the amount of data stored in the database, so the price is not multiplied by the number of copies or Availability Zones.
17. What are IOs in Amazon Neptune and how are they calculated?
Amazon Neptune was designed to eliminate unnecessary IO operations in order to reduce costs and ensure resources are available for serving read/write traffic. Write IOs are only consumed when pushing transaction log records to the storage layer for the purpose of making writes durable. Write IOs are counted in 4KB units. For example, a transaction log record that is 1024 bytes will count as one IO operation. However, concurrent write operations whose transaction log is less than 4KB can be batched together by the Amazon Neptune database engine in order to optimize I/O consumption. Unlike traditional database engines, Amazon Neptune never pushes modified database pages to the storage layer, resulting in further IO consumption savings.
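The 4KB accounting above can be worked through with a small sketch. The batching function is a deliberately simplified model, assuming concurrent log records are packed back to back into shared 4KB units; the real engine's batching behavior is not specified here, and the function names are hypothetical.

```python
import math

IO_UNIT_BYTES = 4 * 1024  # write IOs are counted in 4KB units


def write_ios(record_bytes: int) -> int:
    """Number of write IOs for one transaction log record."""
    if record_bytes <= 0:
        return 0
    return math.ceil(record_bytes / IO_UNIT_BYTES)


def batched_write_ios(record_sizes: list) -> int:
    """Simplified model (assumption): concurrent records share 4KB units."""
    return write_ios(sum(record_sizes))


# A 1024-byte record on its own still consumes one full IO.
single = write_ios(1024)
# Four concurrent 1024-byte records: 4 IOs unbatched, 1 IO if batched together.
unbatched = sum(write_ios(size) for size in [1024] * 4)
batched = batched_write_ios([1024] * 4)
```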
Questions on Hardware and Scaling
18. What are the minimum and maximum storage limits of an Amazon Neptune database?
The minimum storage is 10GB. Based on your database usage, your Amazon Neptune storage will automatically grow, up to 64 TB, in 10GB increments with no impact on database performance. There is no need to provision storage in advance.
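To make the growth rule concrete, here is a small sketch that rounds usage up to the next 10GB increment. It takes the figures from the answer above and assumes 1 TB = 1024 GB for the upper bound; the function name is hypothetical.

```python
import math

INCREMENT_GB = 10
MAX_GB = 64 * 1024  # 64 TB, assuming 1 TB = 1024 GB


def allocated_storage_gb(used_gb: float) -> int:
    """Storage allocated for a given usage, growing in 10GB increments."""
    if used_gb < 0 or used_gb > MAX_GB:
        raise ValueError("usage must be between 0 and the 64 TB limit")
    # At least one increment is always allocated (the 10GB minimum).
    steps = max(1, math.ceil(used_gb / INCREMENT_GB))
    return min(steps * INCREMENT_GB, MAX_GB)
```

For example, a database holding 123GB of data would sit on 130GB of allocated storage under this model.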
19. How do I scale the compute resources associated with my Amazon Neptune DB Instance?
You can scale the compute resources allocated to your DB Instance in the AWS Management Console by selecting the desired DB Instance and clicking the Modify button. Memory and CPU resources are modified by changing your DB Instance class.
When you modify your DB Instance class, your requested changes will be applied during your specified maintenance window. Alternatively, you can use the “Apply Immediately” flag to apply your scaling requests immediately. Both of these options will have an availability impact for a few minutes as the scaling operation is performed. Bear in mind that any other pending system changes will also be applied.
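The same change can be scripted instead of made in the console. The sketch below assumes a boto3-style Neptune client exposing modify_db_instance with the DBInstanceIdentifier, DBInstanceClass, and ApplyImmediately parameters; the helper names and the instance values used with them are hypothetical.

```python
def build_scale_request(instance_id: str, instance_class: str,
                        apply_immediately: bool = False) -> dict:
    """Parameters for changing an instance's class (CPU and memory)."""
    return {
        "DBInstanceIdentifier": instance_id,
        "DBInstanceClass": instance_class,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def scale_instance(client, instance_id: str, instance_class: str,
                   apply_immediately: bool = False):
    """Send the request through a client such as boto3.client("neptune")."""
    params = build_scale_request(instance_id, instance_class, apply_immediately)
    return client.modify_db_instance(**params)
```

With boto3 installed and credentials configured, a call such as scale_instance(boto3.client("neptune"), "my-instance", "db.r5.xlarge", apply_immediately=True) would request the change right away, with the brief availability impact described above.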
Questions on AWS Neptune Backup and Restore
20. How do I enable backups for my DB Instance?
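Automated backups are always enabled on Amazon Neptune and cannot be turned off, so there is nothing to switch on. You can, however, adjust the backup retention period and the preferred backup window by following these steps: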
- Open the Amazon Neptune console and navigate to the “DB Instances” page.
- Select the DB Instance that you want to enable backups for.
- On the “Instance Actions” menu, select “Modify”.
- On the “Modify DB Instance” page, scroll down to the “Backup” section.
- Select the “Backup retention period” that you want to use. This determines how long backups will be kept for the DB Instance.
- Select the “Preferred backup window” that you want to use. This determines the time of day when backups will be taken.
- Click “Continue” to review the changes, and then click “Modify DB Instance” to apply the changes.
Amazon Neptune then creates and stores backups automatically according to the retention period and preferred backup window that you specified. You can use these backups to restore to a previous state in case of data loss or corruption. Note that backup storage may incur additional charges, depending on the retention period that you choose.
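The console steps above can also be expressed as an API request. This is a sketch under stated assumptions: that backup settings are applied at the cluster level through a boto3-style modify_db_cluster call, that the retention period must be between 1 and 35 days, and that the window uses the hh24:mi-hh24:mi format in UTC. The helper name and cluster identifier are hypothetical.

```python
import re

# Assumed window format: "hh24:mi-hh24:mi", for example "03:00-03:30".
WINDOW_PATTERN = re.compile(r"([01]\d|2[0-3]):[0-5]\d-([01]\d|2[0-3]):[0-5]\d")


def build_backup_request(cluster_id: str, retention_days: int, window: str) -> dict:
    """Parameters for setting the backup retention period and window."""
    if not 1 <= retention_days <= 35:  # assumed allowed range
        raise ValueError("retention_days must be between 1 and 35")
    if not WINDOW_PATTERN.fullmatch(window):
        raise ValueError("window must look like 'hh:mm-hh:mm'")
    return {
        "DBClusterIdentifier": cluster_id,
        "BackupRetentionPeriod": retention_days,
        "PreferredBackupWindow": window,
    }


backup_params = build_backup_request("my-neptune-cluster", 7, "03:00-03:30")
# With boto3 installed: boto3.client("neptune").modify_db_cluster(**backup_params)
```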
21. Can I take DB Snapshots and keep them around as long as I want?
Yes, you can take DB Snapshots and keep them around as long as you want in Amazon Neptune. A DB Snapshot is a point-in-time copy of a DB Instance, and it can be used to restore the DB Instance to the state it was in at the time the snapshot was taken. By default, Amazon Neptune retains all DB Snapshots until you explicitly delete them, so you can keep them around for as long as you need.
22. If my database fails, what is my recovery path?
Amazon Neptune automatically maintains 6 copies of your data across 3 Availability Zones and will automatically attempt to recover your database in a healthy AZ with no data loss. In the unlikely event that your data is unavailable within Amazon Neptune storage, you can restore from a DB Snapshot or perform a point-in-time restore operation to a new instance. Note that the latest restorable time for a point-in-time restore operation can be up to 5 minutes in the past.
23. What happens to my automated backups and DB Snapshots if I delete my DB Instance?
You can choose to create a final DB Snapshot when deleting your DB Instance. If you do, you can use this DB Snapshot to restore the deleted DB Instance at a later date. Amazon Neptune retains this final user-created DB Snapshot along with all other manually created DB Snapshots after the DB Instance is deleted. Only DB Snapshots are retained after the DB Instance is deleted (i.e., automated backups created for point-in-time restore are not kept).
24. Can I share my snapshots with another AWS account?
Yes. Neptune gives you the ability to create snapshots of your databases, which you can use later to restore a database. You can share a snapshot with a different AWS account, and the owner of the recipient account can use your snapshot to restore a DB that contains your data. You can even choose to make your snapshots public – that is, anybody can restore a DB containing your (public) data. You can use this feature to share data between your various environments (production, dev/test, staging, etc.) that have different AWS accounts, as well as keep backups of all your data secure in a separate account in case your main AWS account is ever compromised.
25. Will I be billed for shared snapshots?
There is no charge for sharing snapshots between accounts. However, you may be charged for the snapshots themselves, as well as any databases you restore from shared snapshots. Learn more about Amazon Neptune pricing.
26. Can I automatically share snapshots?
We do not support sharing automatic DB snapshots. To share an automatic snapshot, you must manually create a copy of the snapshot and then share the copy.
27. How many accounts can I share snapshots with?
You may share manual snapshots with up to 20 AWS account IDs. If you want to share the snapshot with more than 20 accounts, you can either share the snapshot as public or contact support to increase your quota.
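A sketch of how the 20-account limit might be enforced before calling the API is shown below. It assumes sharing is done through a boto3-style modify_db_cluster_snapshot_attribute call with the "restore" attribute, and it treats the limit as applying to a single request, which is a simplification. The helper name, snapshot identifier, and account IDs are made up.

```python
MAX_SHARED_ACCOUNTS = 20  # limit stated above; can be raised via support


def build_share_request(snapshot_id: str, account_ids: list) -> dict:
    """Parameters for sharing a manual snapshot with other AWS accounts."""
    ids = list(account_ids)
    if not ids:
        raise ValueError("at least one account ID is required")
    if len(ids) > MAX_SHARED_ACCOUNTS:
        raise ValueError("cannot share with more than 20 accounts")
    for account_id in ids:
        # AWS account IDs are 12-digit strings.
        if not (account_id.isdigit() and len(account_id) == 12):
            raise ValueError(f"invalid account ID: {account_id!r}")
    return {
        "DBClusterSnapshotIdentifier": snapshot_id,
        "AttributeName": "restore",
        "ValuesToAdd": ids,
    }


share_params = build_share_request("my-manual-snapshot", ["111122223333", "444455556666"])
# With boto3: boto3.client("neptune").modify_db_cluster_snapshot_attribute(**share_params)
```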
28. In which regions can I share my Amazon Neptune snapshots?
You can share your Neptune snapshots in all AWS regions where Amazon Neptune is available.
29. Can I share my Amazon Neptune snapshots across different regions?
No. Your shared Amazon Neptune snapshots will only be accessible by accounts in the same region as the account that shares them.
30. Can I share an encrypted Amazon Neptune snapshot?
Yes, you can share an encrypted Amazon Neptune snapshot with other AWS accounts. The recipient accounts also need access to the KMS key that was used to encrypt the snapshot, and an encrypted snapshot cannot be made public. To share a snapshot, you can follow these steps:
- Open the Amazon Neptune console and navigate to the “Snapshots” page.
- Select the snapshot that you want to share and click the “Actions” button.
- From the “Actions” menu, select “Share Snapshot”.
- On the “Share Snapshot” page, enter the AWS account IDs of the accounts that you want to share the snapshot with. For an unencrypted snapshot, you can alternatively select the “Public” option to make the snapshot publicly available.
- Click “Share Snapshot” to share the snapshot.
31. Can I use Amazon Neptune snapshots outside of the service?
No. Amazon Neptune snapshots can only be used inside the service, for example to restore a new Neptune database.
Questions on AWS Neptune’s High Availability and Replication
32. How does Amazon Neptune improve my database’s fault tolerance for disk failures?
Amazon Neptune automatically divides your database volume into 10GB segments spread across many disks. Each 10GB chunk of your database volume is replicated six ways, across three Availability Zones. Amazon Neptune is designed to transparently handle the loss of up to two copies of data without affecting database write availability and up to three copies without affecting read availability. Amazon Neptune storage is also self-healing. Data blocks and disks are continuously scanned for errors and repaired automatically.
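The tolerance figures above amount to a simple rule on how many of the six copies are lost. The sketch below encodes only the numbers stated in this answer; the function name is hypothetical, and it does not model the storage layer's actual quorum protocol.

```python
TOTAL_COPIES = 6  # six copies across three Availability Zones


def volume_availability(lost_copies: int) -> dict:
    """Whether reads and writes stay available after losing some copies."""
    if not 0 <= lost_copies <= TOTAL_COPIES:
        raise ValueError("lost_copies must be between 0 and 6")
    return {
        "writes": lost_copies <= 2,  # up to two lost copies: writes unaffected
        "reads": lost_copies <= 3,   # up to three lost copies: reads unaffected
    }
```

Under this rule, losing three copies leaves the volume readable but not writable, since each chunk keeps two copies in each of the three Availability Zones.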
33. How does Amazon Neptune improve recovery time after a database crash?
Unlike other databases, after a database crash, Amazon Neptune does not need to replay the redo log from the last database checkpoint (typically 5 minutes) and confirm that all changes have been applied, before making the database available for operations. This reduces database restart times to less than 60 seconds in most cases. Amazon Neptune moves the buffer cache out of the database process and makes it available immediately at restart time. This prevents you from having to throttle access until the cache is repopulated to avoid brownouts.
34. What kind of replicas does Neptune support?
Amazon Neptune supports Read Replicas, which share the same underlying volume as the primary instance. Updates made by the primary are visible to all Amazon Neptune Replicas.
Feature | Value of Amazon Neptune Replicas |
---|---|
Number of replicas | Up to 15 |
Replication Type | Asynchronous (milliseconds) |
Performance impact on primary | Low |
Act as failover target | Yes (no data loss) |
Automated failover | Yes |
35. Can I have cross-region replicas with Amazon Neptune?
Yes. Amazon Neptune supports cross-region replication through Neptune global databases, in which a primary cluster in one AWS region replicates to read-only secondary clusters in other regions. This can be useful for disaster recovery and for serving reads closer to globally distributed users.
36. Can I prioritize certain replicas as failover targets over others?
Yes. You can assign a promotion priority tier to each instance on your cluster. When the primary instance fails, Amazon Neptune will promote the replica with the highest priority to the primary. If there is contention between 2 or more replicas in the same priority tier, then Amazon Neptune will promote the replica that is the same size as the primary instance.
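The selection rule described above can be sketched as follows. It assumes that a lower tier number means a higher priority (tier 0 first), and it follows this answer's tie-break of preferring a replica with the same instance class as the primary. The Replica type and the function name are hypothetical, and this is not Neptune's actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Replica:
    name: str
    tier: int  # promotion tier; assumption: lower number = higher priority
    instance_class: str


def pick_failover_target(replicas: List[Replica], primary_class: str) -> Optional[Replica]:
    """Pick the replica to promote when the primary fails."""
    if not replicas:
        return None
    best_tier = min(r.tier for r in replicas)
    candidates = [r for r in replicas if r.tier == best_tier]
    # Tie-break: prefer a replica sized like the primary, if there is one.
    same_size = [r for r in candidates if r.instance_class == primary_class]
    return (same_size or candidates)[0]


replicas = [
    Replica("replica-a", tier=1, instance_class="db.r5.large"),
    Replica("replica-b", tier=0, instance_class="db.r5.large"),
    Replica("replica-c", tier=0, instance_class="db.r5.xlarge"),
]
target = pick_failover_target(replicas, primary_class="db.r5.xlarge")
```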
37. Can I modify priority tiers for instances after they have been created?
Yes. You can modify the promotion priority tier of an instance at any time after it has been created. Priority tiers determine the order in which replicas are considered for promotion when the primary becomes unavailable. Changing a tier does not by itself trigger a failover.
38. Can I prevent certain replicas from being promoted to the primary instance?
Not entirely. You can assign lower priority tiers to the replicas that you do not want promoted, which makes Amazon Neptune prefer other replicas. However, if the higher-priority replicas in the cluster are unhealthy or unavailable, Amazon Neptune will still promote a lower-priority replica.
39. How can I improve upon the availability of a single Amazon Neptune database?
You can add Amazon Neptune Replicas. Amazon Neptune Replicas share the same underlying storage as the primary instance. Any Amazon Neptune Replica can be promoted to become primary without any data loss and therefore can be used for enhancing fault tolerance in the event of a primary DB Instance failure. To increase database availability, simply create 1 to 15 replicas, and Amazon Neptune will automatically include them in the failover primary selection in the event of a database outage.
40. What happens during failover and how long does it take?
Failover is automatically handled by Amazon Neptune so that your applications can resume database operations as quickly as possible without manual administrative intervention.
If you have an Amazon Neptune Replica, in the same or a different Availability Zone, then on failover Amazon Neptune flips the canonical name record (CNAME) for your DB primary endpoint to a healthy replica, which is in turn promoted to become the new primary. Start-to-finish, failover typically completes within 30 seconds. Additionally, the read replicas endpoint doesn’t require any CNAME updates during failover.
If you do not have an Amazon Neptune Replica (i.e. single instance), Neptune will first attempt to create a new DB Instance in the same Availability Zone as the original instance. If unable to do so, Neptune will attempt to create a new DB Instance in a different Availability Zone. From start to finish, failover typically completes in under 15 minutes.
Your application should retry database requests in the event of connection loss.
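One common way to implement such retries is exponential backoff. The sketch below is a generic pattern, not a Neptune-specific API: it assumes the database call raises ConnectionError when the connection is lost, and the attempt count and delays are illustrative values.

```python
import time


def with_retries(operation, attempts: int = 5, base_delay: float = 0.5, sleep=time.sleep):
    """Run operation(), retrying on connection loss with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Wait 0.5s, 1s, 2s, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits add up to 7.5 seconds across five attempts. That is shorter than the typical 30-second failover mentioned above, so a real application would likely use more attempts or longer delays.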
41. If I have a primary database and an Amazon Neptune Replica actively taking read traffic and a failover occurs, what happens?
Amazon Neptune will automatically detect a problem with your primary instance and begin routing your read/write traffic to an Amazon Neptune Replica. On average, this failover will complete within 30 seconds. In addition, the read traffic that your Amazon Neptune Replicas were serving will be briefly interrupted.
42. How far behind the primary will my replicas be?
In Amazon Neptune, replicas are typically updated with changes from the primary database in near real-time, with a delay of just a few milliseconds. This means that the replicas are usually only a few milliseconds behind the primary, and they are generally kept in sync with the primary database at all times.
However, it is possible that the replicas may be slightly further behind the primary under certain circumstances, such as during periods of high activity or network congestion. In general, the replicas are designed to be as close to the primary as possible, but it is important to understand that there may be some latency between the primary and the replicas.
Questions on AWS Neptune Security
43. Can I use Amazon Neptune in Amazon Virtual Private Cloud (Amazon VPC)?
Yes, all Amazon Neptune DB Instances must be created in a VPC. With Amazon VPC, you can define a virtual network topology that closely resembles a traditional network that you might operate in your own data center. This gives you complete control over who can access your Amazon Neptune databases.
44. Does Amazon Neptune support encrypting my data in transit and at rest?
Amazon Neptune supports HTTPS-encrypted client connections and also allows you to encrypt your databases using keys you manage through AWS Key Management Service (KMS). On a database instance running with Amazon Neptune encryption, data stored at rest in the underlying storage is encrypted, as are its automated backups, snapshots, and replicas in the same cluster. Encryption and decryption are handled seamlessly. For more information about the use of KMS with Amazon Neptune, see the Amazon Neptune User Guide.
45. Can I encrypt an existing unencrypted database?
Currently, encrypting an existing unencrypted Neptune instance is not supported. To use Amazon Neptune encryption for an existing unencrypted database, create a new DB Instance with encryption enabled and migrate your data into it.
46. How do I access my Amazon Neptune database?
Access to Amazon Neptune databases must go through the HTTP port specified at database creation, from within your VPC. This provides an additional layer of security for your data. Step-by-step instructions on how to connect to your Amazon Neptune database are provided in the Amazon Neptune User Guide.