Saturday 18 October 2014

DB - distributed file systems - Thrift

Just use MySQL or Postgres. Why? If your entire _active set_ fits in a single machine's main memory (which can be as high as 128GB+ with modern commodity machines), you don't have a horizontal scalability problem: i.e., there is absolutely no reason for you to partition ("shard") your database and give up relations. If your active data set fits in memory, almost any properly tuned database with an index will perform well enough to saturate your Ethernet card before the database itself becomes a limitation.

If you decide the relational model itself isn't a good fit, you can easily build a "document oriented store" on top of MySQL. This is what Friendfeed ended up doing, and I'd follow their model (except I'd use Avro, Apache Thrift, or Protocol Buffers instead of language-specific serialization) -- http://bret.appspot.com/entry/ho...
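A minimal sketch of that idea, under stated assumptions: SQLite stands in for MySQL so the example runs anywhere, and JSON stands in for a language-neutral format such as Avro, Thrift, or Protocol Buffers. The table names, the `user_id` index, and the function names are hypothetical and are not Friendfeed's actual schema.

```python
import json
import sqlite3
import uuid


def open_store(path=":memory:"):
    """Create one opaque entity table plus one index table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities ("
        "id TEXT PRIMARY KEY, body BLOB NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_user_id ("
        "user_id TEXT NOT NULL, entity_id TEXT NOT NULL, "
        "PRIMARY KEY (user_id, entity_id))"
    )
    return conn


def put(conn, doc):
    """Store a document as a serialized blob and refresh its index row."""
    entity_id = doc.get("id") or uuid.uuid4().hex
    doc = dict(doc, id=entity_id)
    body = json.dumps(doc).encode("utf-8")
    with conn:  # one transaction: blob and index change together
        conn.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, body),
        )
        conn.execute("DELETE FROM index_user_id WHERE entity_id = ?", (entity_id,))
        if "user_id" in doc:
            conn.execute(
                "INSERT INTO index_user_id (user_id, entity_id) VALUES (?, ?)",
                (doc["user_id"], entity_id),
            )
    return entity_id


def get(conn, entity_id):
    """Fetch one document by primary key, or None if it does not exist."""
    row = conn.execute(
        "SELECT body FROM entities WHERE id = ?", (entity_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def find_by_user(conn, user_id):
    """Look up documents through the index table instead of scanning blobs."""
    rows = conn.execute(
        "SELECT e.body FROM index_user_id i "
        "JOIN entities e ON e.id = i.entity_id "
        "WHERE i.user_id = ? ORDER BY e.id",
        (user_id,),
    ).fetchall()
    return [json.loads(r[0]) for r in rows]
```

The database only ever sees an id and a blob, so adding a field to a document needs no schema change; anything you want to query by gets its own small index table.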

If your site becomes immensely successful, you will have an active set that no longer fits into your machine's main memory. In this case, an improperly designed storage engine's performance will fall off rapidly. MySQL's InnoDB (or Postgres's storage engine), however, will still allow you to maintain (depending on your request distribution) a ~2:1-5:1 data to memory ratio with a spinning disk. Once you've gone beyond that, performance begins to fall off rapidly (as you're making multiple disk seeks for every request). Now, your best course of action is just to upgrade to SSDs (solid state drives), which -- again -- allow you to saturate your Ethernet card *before* the database becomes a limitation.
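As a rough illustration of the thresholds above, the rule of thumb can be written as a small function. This is only a sketch: the function name and return labels are made up, and `max_disk_ratio` is an assumed knob that you would set somewhere in the 2 to 5 range depending on your request distribution.

```python
def storage_advice(active_set_gb, ram_gb, max_disk_ratio=2.0):
    """Return a coarse hint based on the data-to-memory ratio.

    max_disk_ratio is the largest data:memory ratio at which a spinning
    disk is still assumed to keep up (roughly 2 to 5 per the text).
    """
    if active_set_gb <= 0 or ram_gb <= 0:
        raise ValueError("sizes must be positive")
    ratio = active_set_gb / ram_gb
    if ratio <= 1.0:
        return "in-memory"       # whole active set fits in RAM
    if ratio <= max_disk_ratio:
        return "spinning-disk"   # buffer pool still absorbs most requests
    return "ssd"                 # too many seeks per request for a disk
```

For example, `storage_advice(64, 128)` falls in the in-memory case, while `storage_advice(1000, 128)` is well past any plausible disk ratio.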

Finally, when you get to a data set size that doesn't fit on, e.g., several SSDs in a software RAID 1+0 configuration (while giving you space for backups, multiple versions of data, etc.), then you have to scale horizontally. That is, you will have to use a database that intrinsically supports partitioning (e.g., Riak, Voldemort, Cassandra, HBase) or build an application-level partitioning layer on top of your MySQL/Postgres based data store. I can't tell you which solution is correct, as neither I nor you have any clue what your data and its access patterns will be like at that point. That said, writing your own sharding layer is yet another place where you can introduce additional bugs into the code: not having to build your own distributed database (what you are effectively doing by building a sharding layer) is the major appeal of using an existing, scalable NoSQL database.
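To make "application-level partitioning layer" concrete, here is about the smallest possible version: a router that hashes a key to one of a fixed list of databases. It is a sketch only; the class name and shard names are hypothetical, and a real layer also has to deal with resharding, cross-shard queries, and failover.

```python
import hashlib


class ShardRouter:
    """Maps a key to one of a fixed list of shards (hypothetical names)."""

    def __init__(self, shard_names):
        if not shard_names:
            raise ValueError("at least one shard is required")
        self.shard_names = list(shard_names)

    def shard_for(self, key):
        # Use a stable hash; Python's built-in hash() of a str varies per process.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % len(self.shard_names)
        return self.shard_names[bucket]


router = ShardRouter(["db0", "db1", "db2"])
target = router.shard_for("user:42")  # always the same shard for this key
```

Even this toy shows where the bugs come from: with plain modulo hashing, changing the number of shards sends most keys to a different database, so growing the cluster means migrating data.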

Note how I am still not bringing the CAP theorem into play. The reason is that CAP itself has nothing to do with scalability, but everything to do with availability and handling of failures. What it means is that under certain failure scenarios (called Partitions, not to be confused with database partitioning!), you cannot retain Availability and provide for linearizable Consistency at the same time. Linearizable consistency roughly corresponds to A and I in ACID. This has more to do with replication of a single entity (e.g., a row in a database) across multiple machines; with horizontal partitioning it's already difficult (for other reasons) to perform transactions between multiple entities in a database.

It's a common misconception that SQL databases "choose C" and "NoSQL" databases "choose A". In reality, I believe several SQL databases do *not* by default use the "serializable" transaction isolation level (choosing instead snapshot isolation) even on a single machine. When using MySQL's asynchronous replication, it's possible to be in a scenario where a master machine receives a write, allows readers to see the write, and then goes down *before* shipping this value to another replica -- i.e., losing serializable consistency when the other replica is read from (upon the master's failure).
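The failure scenario with asynchronous replication can be replayed with a toy model. This is not how MySQL is implemented; the class and method names are invented, and two dictionaries stand in for the master and the replica.

```python
class AsyncReplicatedStore:
    """Toy master/replica pair with asynchronous (deferred) replication."""

    def __init__(self):
        self.master = {}
        self.replica = {}
        self.unshipped = []     # writes acknowledged but not yet replicated
        self.master_up = True

    def write(self, key, value):
        # The write is visible on the master immediately.
        self.master[key] = value
        self.unshipped.append((key, value))

    def ship(self):
        # Replication happens later, independently of the write path.
        for key, value in self.unshipped:
            self.replica[key] = value
        self.unshipped.clear()

    def fail_master(self):
        self.master_up = False

    def read(self, key):
        source = self.master if self.master_up else self.replica
        return source.get(key)


store = AsyncReplicatedStore()
store.write("x", 1)
store.ship()
store.write("x", 2)            # acknowledged, not yet shipped
seen_before = store.read("x")  # a reader observes 2 on the master
store.fail_master()
seen_after = store.read("x")   # after failover the replica still says 1
```

A reader that saw `2` and then sees `1` has watched a committed write disappear, which is exactly the consistency loss described above.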

At the same time, many NoSQL databases (e.g., HBase) do not actually provide "cap-A" Availability (in exchange for atomic mutation/compare-and-set operations, e.g., atomically incrementing a column within a row in HBase) or allow themselves to be configured (e.g., Voldemort or Riak configured to require strict read and write quorums) for consistency rather than availability (e.g., for applications such as counters).
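In Dynamo-style stores, "strict quorums" are usually described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. When R + W > N, every read set overlaps every write set. A small sketch of that arithmetic, with hypothetical function names:

```python
def is_strict_quorum(n, r, w):
    """True if every read quorum must overlap every write quorum."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n, r, w):
    """Replicas that can be down while reads and writes both still succeed."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return n - max(r, w)
```

With N=3, R=2, W=2 the quorums overlap and one replica may be down; with N=3, R=1, W=1 you get more availability, but a read can miss the latest write. That is the consistency versus availability dial in configuration form.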

There's also a hidden variable in CAP: latency. If you can simply re-try your operation until a new master node is elected or the old one comes back online (which is usually fast, as most failures are transient), you will effectively have both high A-availability and C-consistency, as you can simply wait for the P-partition to be over (this time is called "MTTR"). That, obviously, isn't an option for large sites: users will click away if they wait too long for pages to load, and money will be lost if items can't be added to shopping carts or ads can't be displayed. However, that isn't necessarily a concern when your traffic volumes aren't significant: again, this is a business decision.
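The "just retry until the master is back" approach might look like the sketch below. The `NotAvailable` exception and the parameter names are assumptions for illustration; a real client library will have its own error types. The deadline is the business decision: it is how much latency you are willing to trade for consistency.

```python
import time


class NotAvailable(Exception):
    """Hypothetical error raised while no master is reachable."""


def call_with_retry(operation, deadline_s=5.0, first_delay_s=0.05,
                    max_delay_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Retry operation with exponential backoff until it succeeds or time runs out."""
    start = clock()
    delay = first_delay_s
    while True:
        try:
            return operation()
        except NotAvailable:
            # Give up if waiting once more would exceed the deadline.
            if clock() - start + delay > deadline_s:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay_s)
```

A batch job might pass a deadline of minutes; a page render would pass a fraction of a second and fall back to something else when the call raises.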

Which CAP trade-offs do you choose? That, again, depends on your application and your data. You may note that many large applications (e.g., complex websites) use a combination of the two (strong consistency for some operations, high availability for others), depending on the business requirements.

(Note: I am grossly oversimplifying and speaking in the context of a single datacenter. When you have replication across a WAN, strong consistency becomes impractical (the latency costs are prohibitive) -- that's why, e.g., HBase supports log shipping to allow asynchronous replication to a remote site).

Summary: understand your data and application, and *then* plan for providing scalability and high availability for your data and application. If you're intellectually curious about distributed systems and issues like CAP et al, see the answers in What are the best resources for learning about distributed file systems?


---------------------

http://thrift.apache.org/ 
Thrift:
- has been open source for longer, so it has support for more languages.
- is lacking much documentation, though there are pretty good examples
- is slightly slower
- includes an RPC framework

Overall I'd generally use Thrift if the use case was to quickly make a server that can serve RPCs. If I were storing data in a log or a flat file and needed a serialization format, and was only going to use one of the languages Protocol Buffers supported, then I'd use that.


----------------------



systems-oriented reading list in approximately chronological order:

* Design and Implementation of the Sun Network Filesystem - http://www.cs.ucsb.edu/~ravenben...
* Scale and Performance in a Distributed Filesystem - http://citeseerx.ist.psu.edu/vie...
* A Case for Redundant Arrays of Inexpensive Disks (RAID) - http://www.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf
* Separating data and control transfer in distributed operating systems - http://portal.acm.org/citation.c...
* Zebra (and other research from Sprite) - http://www.eecs.berkeley.edu/Res...
* Disconnected Operation in the Coda Filesystem - http://www.cs.ucsb.edu/~ravenben...
* Frangipani: A Scalable Distributed Filesystem - http://pdos.csail.mit.edu/6.824/... (and Petal: http://net.pku.edu.cn/~course/cs...)
* The Google File System - http://labs.google.com/papers/gf...
* CEPH: A Scalable, High-performance Distributed Filesystem - http://ceph.newdream.net/papers/...

You can't just dive in and read all the literature so you've got to choose an angle. Since (in my opinion) there's little fundamental theory that you have to learn before you can make sense of research filesystems, and much of the difficulty lies in the massive and interesting engineering challenges that scale and performance offer, it makes sense to read about the systems first and dive into the areas of theory you find interesting.
 

Thursday 6 March 2014

White Gold vs. Platinum

White Gold

White gold is a popular alternative to yellow gold, silver or platinum. Some people prefer the silver color of white gold to the yellow color of normal gold, yet may find silver to be too soft or too easily tarnished or the cost of platinum to be prohibitive. While white gold contains varying amounts of gold, which is always yellow, it also contains one or more white metals to lighten its color and add strength and durability. The most common white metals that form the white gold alloy are nickel, palladium, platinum and manganese. Sometimes copper, zinc or silver are added. The purity of white gold is expressed in karats, the same as with yellow gold.

White gold is an alloy of gold and some white metals such as silver and palladium. White gold can be 18kt, 14kt, 9kt or any karat. For example, 18kt yellow gold is made by mixing 75% gold (750 parts per thousand) with 25% (250 parts per thousand) other metals such as copper and zinc. 18kt white gold is made by mixing 75% gold with 25% other metals such as silver and palladium. So the amount of gold is the same but the alloy is different.

Is the color of white gold different to platinum?

If a white gold item has been rhodium plated (note: most white gold rings are rhodium plated) then the color difference will not really be noticeable at all. For example, the image on the left shows a rhodium plated 18kt white gold diamond engagement ring together with a platinum wedding ring. As you can see, there is essentially no difference in the metal color.

Then compare that with this picture on the left that shows the contrast between the natural colors of white gold and platinum. This men's ring incorporates both platinum and white gold. The light sections are platinum, the darker sections are the natural color of white gold.

Platinum

Platinum is a white metal, but unlike gold it is used in jewelry in almost its pure form (approximately 95% pure). Platinum is extremely long wearing and is very white, so it does not need to be Rhodium plated like white gold.
Platinum is very dense (heavy), so a platinum ring will feel heavier than an 18kt gold ring.
Platinum is, however, very expensive. A platinum ring will be approximately twice the price of an 18kt white gold ring (excluding gemstone costs).

For most jewelry, white gold is going to do the same job as platinum. Platinum is heavier and a bit more durable. It can be fabricated better than gold when making complex items, but simple solitaire rings generally are not made in a way that strictly requires platinum. Platinum is easier to hand engrave, but there is plenty of engraved white gold jewelry, too.

When not rhodium plated, white gold is a bit yellowish and platinum is more grayish in color. Rhodium on a ring is a short term plating anyway so it is only an initial "look".

People will almost always recommend platinum, if you can afford it. I basically agree, but I wear a nice 14kt white gold diamond ring every day and it is lightweight and looks just fine. So the answer is really a matter of choice in most cases.

Monday 3 March 2014

(real social)^7

(rough pattern reminder to collect information )

(A) base object – (human) >> (i) men (ii) women --- (i)boys (ii) girls --- and children.

(B) sub objects (communication and reaction) ---

1) (with all possible positive reaction) buenos dias, Guten Morgen, good morning.

2) (formal and Informal communication) Hola, aló, jaló, bueno, al, diga Grüß dich! Grüß Gott! Guten Tag. ¿Cómo estás? ¿Cómo está? Entschuldigen Sie bitte! Entschuldigung! Pardon me? Wie bitte? Thank you. Danke. I'm sorry. Es tut mir leid. Really? Wirklich? Gladly! Gerne! Sehr erfreut. Mach's gut. Adiós Goodbye Auf Wiedersehen. Gracias ¿Cómo te va? ¿Cómo le va? ¿Qué tal? ¿Qué hay? ¿Qué pasa? ¿Qué hubo? ¿Qué onda?...

3) (Over past objects) ‘what do you think?’, ’what are you thinking?’, ‘what is your reaction?’, what do you want?

4) Thinking………… simply it starts with why and how

5) Relation of interest, nothing happens without any reason

| ------------- formal and Informal social/business relation

| (all possible objects between 1 to 6)

|

|

|

6) Guten Abend. Buenas noches

^0) past object >> videos, pictures and intellectual property

^0) (one must abide by Law) like/love/dislike/hate--connected to almost all sub objects and the base object

(C) places -----where people meet, part of introduction, college university parks club library and connected to 5

(D) Businesses and trade (social business concept)

(E) Knowledge

(F) Nash game theory, the page rank, Like Analysis, Democratic Analysis

…………………..

(G) Intelligent Search >> comprising all object A to E




(h) politics
(formal and Informal communication)

Friday 3 January 2014

Online and Outlet Store Project

List of Most Popular OS Shopping Cart

Open Source Shopping Cart Solution


Database for Shopping Cart

Open Source Shopping Cart Solution


Magento: It's built using the Zend Framework and it's heavily MVC. If you're familiar with the Zend Framework, this might be a good solution for you. It's a lot more difficult and has a much steeper learning curve than PrestaShop.
TomatoCart: This is a fork of osCommerce 3.0 beta. The creators have tried hard to improve the platform, but osCommerce has a lot of rotten design and it's a mess to extend. I've tried creating a couple of modules and had many problems. There isn't much documentation to help you and the community is not very active.
VirtueMart: If you are familiar with Joomla, this might be an option. If you're not, you should stay away because you might have a few problems setting everything up correctly and integrating this component with your template.
osCommerce: Just stay away from this...

OpenCart