Hi, I would like a paper review written as a short essay (3–4 pages, double-spaced) on this article, with one paragraph for each question:
http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf
Follow these steps:
1. Summary: Give a brief summary of the work in your own words.
· What is the research problem the paper attempts to address?
· What are the claimed contributions of the paper?
· How do the authors substantiate their claims?
· What are the conclusions?
2. Evaluation:
Evaluate the work by answering these questions:
· Is the research problem significant?
· Are the contributions significant?
· Are the claims valid?
3. Synthesis: Generate any interesting thoughts you have on the work by consulting the following questions:
· What is the crux of the research problem?
· What are some alternative approaches to address the research problem?
· What is a better way to substantiate the claim of the authors?
· What is a good argument against the case made by the authors?
· How can the research results be improved?
· Can the research results be applied to another context?
· What are the open problems raised by this work?
· Bottom-line: Can we do better than the authors?
Spanner: Google’s Globally-Distributed Database
Name:
Institution:
Date:
Summary of the work
Spanner is Google’s overarching tool for organizing the world’s monetizable information. The paper reads as a carefully honed account of how Google built a major piece of infrastructure, and it carries some of the revolutionary spark that Bigtable brought before it. One distinctive feature of Spanner is that it relies on GPS receivers and atomic-clock hardware. Projects based purely on software are generally more likely to succeed than projects that also depend on specialized hardware. Given this dependence, the hope is that cloud providers can lend a hand by offering such specialized, highly valuable services.
What is the research problem the paper attempts to address?
The paper sets out to describe how Spanner is structured, its feature set, the rationale behind its various design decisions, and a novel time API that exposes clock uncertainty.
What are the claimed contributions of the paper?
According to the abstract, Spanner is Google’s scalable, multi-version, globally distributed, and synchronously replicated database (ESA (Symposium), Fiat, & Sanders, 2009). The paper also claims that the time API and its implementation are critical to supporting external consistency and a variety of other powerful features. The authors state plainly that applications can use Spanner for high availability, even in the face of wide-area natural disasters. As the introduction suggests, Spanner provides several other interesting features as well.
How do the authors substantiate their claims?
Spanner is a globally distributed database. Replication configurations for data can be controlled dynamically, at a fine grain, by applications. At the highest level of abstraction, the database shards data across many sets of Paxos state machines (James, Jeffrey, Michael, Andrew, Christopher, Sanjay, Dale, 2013) in datacenters spread all over the world. Replication is used for global availability and geographic locality, and clients automatically fail over between replicas. Spanner automatically reshards data across machines as the amount of data or the number of servers changes.
Additionally, it automatically migrates data across machines to balance load and to respond to failures. Spanner is designed to scale up to millions of machines across hundreds of datacenters and trillions of database rows. Applications can use Spanner for high availability by replicating their data within or even across continents. The authors mention that the initial customer was F1. F1 uses five replicas spread across the United States.
As the introduction makes clear, the main focus of Spanner is managing cross-datacenter replicated data, but the authors also spent a significant amount of time designing and implementing important database features on top of their distributed-systems infrastructure. On implementation, the paper lays out the directory abstraction, which is used to manage replication and locality and is the unit of data movement. The implementation section also describes the data model. The deployment of Spanner is described as global, with the system managing data globally (ESA (Symposium), Fiat, & Sanders, 2009). A deployment is called a universe, and as a result there are likely to be only a handful of running universes. The paper states that Spanner is organized as a set of zones, where each zone is a rough analog of a deployment of Bigtable servers.
What are the conclusions?
In its conclusions, the paper explains that Spanner combines and extends ideas from two research communities: the database community and the systems community. One aspect of the design stands out: the linchpin of Spanner’s feature set is TrueTime. By reifying clock uncertainty in the time API, the authors show that it is possible to build distributed systems with much stronger time semantics.
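The idea of exposing clock uncertainty can be illustrated with a small sketch. The paper describes a TrueTime API whose `now()` returns an interval with `earliest` and `latest` bounds, plus `after(t)` and `before(t)` checks. Everything else below is an assumption made for illustration: the class name `SimulatedTrueTime`, the fixed `epsilon`, the use of a local monotonic clock as a stand-in for absolute time, and the polling loop in `commit_wait`. It is a toy model, not Google’s implementation.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # lower bound on the true time, in seconds
    latest: float    # upper bound on the true time, in seconds


class SimulatedTrueTime:
    """Toy TrueTime-style clock with a fixed, assumed uncertainty bound."""

    def __init__(self, epsilon, clock=time.monotonic):
        self.epsilon = epsilon  # assumed constant; a real bound would vary
        self._clock = clock     # local clock standing in for absolute time

    def now(self):
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(tt, sleep=time.sleep):
    """Choose a commit timestamp, then wait until it is definitely in the past."""
    s = tt.now().latest
    while not tt.after(s):
        sleep(tt.epsilon / 4)  # hypothetical polling interval
    return s
```

With `epsilon` set to a few milliseconds, `commit_wait` blocks for roughly twice that long before returning. This is the price the design pays for timestamps whose order matches real-time order.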
Evaluation of the work: its significance, contributions, and the validity of the claims presented
Is the research problem significant?
The research problem, which concerns the role of Spanner in database systems, is vital and significant. We cannot overstate the role played by the authors in bringing this out. Unlike Bigtable, Spanner assigns timestamps to data, which makes it more like a multi-version database than a key-value store. The system supports replication. However, an early design that allowed more flexible replication configurations proved so complex that the team abandoned it. This points to a weakness in the work. Even though the team ran into complexity, they should not have abandoned that design. In effect, the current system is a rebuild of a design that failed and became obsolete. Each Paxos state machine stores its metadata and log in its corresponding tablet. Another big advantage of the system is its support for long-lived leaders with time-based leader leases, whose length defaults to about ten seconds.
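To make the multi-version point above concrete, here is a minimal sketch of a store that keeps every timestamped version of a key and answers reads as of a given timestamp. The class `VersionedStore`, its method names, and the integer timestamps are all hypothetical and chosen for illustration. The sketch shows the general technique only and is not Spanner’s storage layer.

```python
import bisect


class VersionedStore:
    """Toy multi-version key-value store: each write keeps its timestamp."""

    def __init__(self):
        self._timestamps = {}  # key -> sorted list of commit timestamps
        self._values = {}      # key -> values, parallel to _timestamps

    def write(self, key, timestamp, value):
        ts = self._timestamps.setdefault(key, [])
        vs = self._values.setdefault(key, [])
        i = bisect.bisect_right(ts, timestamp)  # keep versions ordered by time
        ts.insert(i, timestamp)
        vs.insert(i, value)

    def read(self, key, timestamp):
        """Return the newest value written at or before timestamp, else None."""
        ts = self._timestamps.get(key, [])
        i = bisect.bisect_right(ts, timestamp)
        return self._values[key][i - 1] if i else None
```

A read at an old timestamp returns the value as it stood at that time, which is what separates a multi-version store from a plain key-value map that keeps only the latest value.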
Are the contributions significant?
The contributions arising from the paper are very significant, especially to its intended audience. We look at these contributions step by step. The current implementation logs every Paxos write twice. This choice was made out of expediency, and the authors expect to remedy it eventually. The Paxos implementation is also pipelined, which improves throughput in the presence of WAN latencies. The Paxos state machines are used to implement a consistently replicated bag of mappings. The mapping state of each replica is stored in its corresponding tablet. The article is significant because the claims it makes are valid and its explanations are sensible.
Another key feature of the Spanner implementation is its support for a bucketing abstraction called a directory. A directory is a set of contiguous keys that share a common prefix. Supporting directories is essential because it lets applications control the locality of their data by choosing keys carefully. On availability, the article gives a striking account of the benefits of running Spanner in multiple datacenters (Srinivasa, Jayalakshmi, & Srinivasan, 2015).
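The directory idea described above, contiguous keys sharing a common prefix, can be sketched in a few lines. The function name `split_into_directories`, the tuple-shaped keys, and the user/album layout in the test data are assumptions made for illustration, not Spanner’s actual key encoding.

```python
from itertools import groupby


def split_into_directories(keys, prefix_len=1):
    """Group tuple keys into contiguous runs sharing their first prefix_len parts."""
    ordered = sorted(keys)  # sorting makes keys with a shared prefix contiguous
    return {
        prefix: list(run)
        for prefix, run in groupby(ordered, key=lambda k: k[:prefix_len])
    }
```

If an application prefixes every row belonging to one user with that user’s ID, all of those rows land in the same group. That group can then be placed or moved as a single unit, which is how careful key choice translates into control over locality.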
Are the claims valid?
Most of the claims made in the paper are valid, since they are backed by extensive data and practical applications. Let us discuss some of these practical approaches and well-researched avenues. With three experiments on throughput in the presence of datacenter failure, all overlaid on the same time scale, the team was able to demonstrate viable answers to this issue. Another area that was evaluated quite extensively is microbenchmarks. The team ran Spanner microbenchmarks whose measurements were taken on timeshared machines. Clients were run on separate machines. Each zone contained a single spanserver.
Overall, in the latency experiments, clients issued sufficiently few operations to avoid queuing at the servers. The results establish that latency stays roughly constant as the number of replicas increases. On TrueTime, the article addresses two fundamental questions: whether the uncertainty bound truly bounds clock uncertainty, and how bad that uncertainty can get. The paper offers substantive, well-reasoned arguments on the research problem, and these prove quite useful. The claims discussed in the paragraphs above are explained in detail, with examples drawn from the related work and future work sections.
Synthesis: crux of the research problem, alternative approaches, and better ways to substantiate the authors’ claims
What is the crux of the research problem?
The crux of the research problem, that is, the decisive or most significant point at issue, is the description of Spanner’s structure. Google describes Spanner as not purely relational, since each table must possess at least one primary-key column. Bigtable users have frequently complained about its lack of cross-row transactions, and this is the main issue the article addresses.
What are some alternative approaches to address the research problem?
This section considers alternative approaches to the research problem. The data model offered by Spanner is not entirely relational, in that rows must have names. More precisely, every table is required to have an ordered set of one or more primary-key columns. This requirement is where Spanner still resembles a key-value store: the primary keys name a row, and each table maps its primary-key columns to the remaining columns. Given the strengths of the design presented, more alternative methods could have been considered. The subject matter addressed by this team of researchers is important, especially to technology experts and software engineers.
The team makes it clear that they have worked with the F1 team for some time. Other clients obviously have not had that collaboration, and this presents an opportunity for criticism (Corbett, Dean, Epstein, Fikes, Frost, Furman, Ghemawat, Hochschild, 2013). Some of the jargon used in the article may seem rather complex to the lay reader. When the values in a read-only transaction’s scope are served by a single Paxos group, the client issues the read-only transaction to that group’s leader. The text does not explain the Paxos servers in detail, which makes the overall concept harder to understand. Another weak area is the way schema-change transactions are explained (James, Jeffrey, Michael, Andrew, Christopher, Sanjay, Dale, 2013). TrueTime enables Spanner to support atomic schema changes. Using a standard transaction would be infeasible here, since the number of participants could be in the millions.
What is a better way to substantiate the claim of the authors?
Bigtable supports atomic schema changes within a single datacenter, although its schema changes block all operations (Corbett, Dean, Epstein, Fikes, Frost, Furman, Ghemawat, Hochschild, 2013). The team mentions that they are currently implementing the Spanner schema language, automatic maintenance of secondary indices, and automatic load-based resharding. The team also plans to investigate some further features. For example, optimistically doing reads in parallel may be a valuable strategy to pursue. Although this path is possible, the right implementation is likely to be non-trivial.
References
Corbett, J. C., Dean, J., Epstein, M., Fikes, A., Frost, C., Furman, J. J., Ghemawat, S., … Hochschild, P. (January 01, 2013). Spanner: Google’s Globally Distributed Database. ACM Transactions on Computer Systems, 31, 3.
ESA (Symposium), Fiat, A., & Sanders, P. (2009). Algorithms, ESA 2009: 17th annual European symposium, Copenhagen, Denmark, September 7-9 : proceedings. Berlin: Springer.
James, C. C., Jeffrey, D., Michael, E., Andrew, F., Christopher, F., J, J. F., Sanjay, G., … Dale, W. (August 01, 2013). Spanner: Google’s Globally Distributed Database. ACM Transactions on Computer Systems (TOCS), 31, 3, 1-22.
Srinivasa, K. G., Jayalakshmi, D. S. K. G., & Srinivasan, R. G. (January 01, 2015). Data Intensive Cloud Computing.