LinkedIn said on Tuesday that it has open sourced Cubert, a framework that uses specialized algorithms to organize data so that queries can run without overburdening the system or wasting CPU resources.
Cubert, whose name is derived from the Rubik’s Cube, is supposedly as easy for engineers to work with as a Java application. It includes a “script-like user interface” from which engineers can run algorithms like MeshJoin and Cube on top of the organized data to save system resources when running queries.
From the LinkedIn blog post:
[blockquote person=”LinkedIn” attribution=”LinkedIn”]Extant engines such as Apache Pig, Hive and Shark (orange blocks) provide a logical declarative language that is translated into a physical plan. This plan is executed on the distributed engine (Map-Reduce, Tez or Spark), where the physical operators are executed against the data partitions. Finally, the data partitions are managed via the…