Category: Data Science

Refactored: A Finalist in the SOLVE @ MIT Global Competition

Refactored AI: Innovative Solution from Colaberry Labs Emerges as Finalist in SOLVE @ MIT Global Competition. Voting for the MIT SOLVE competition has ended; please visit http://training.colaberry.com for more information about Colaberry Labs.

 

Toy Neural Network Classifies Orientation of Line

Visit Refactored to play with the interactive network. Artificial neural networks are abstractly inspired by the basic computational units of the brain, neurons. State-of-the-art solutions to AI problems including image recognition, natural language processing, simulating human creativity in the arts, and outperforming humans in complex board games are backed by artificial neural networks. Instead of …

 

Record Linkage in a Data Lake

Enterprises typically have many large data sets that sit in enterprise systems, in legacy systems, and/or in big data lakes. With data generated exponentially from numerous sources and stored continuously in inexpensive, unstructured big data environments, “Record Linkage (RL)” is a huge challenge that all enterprises face …

 

Road to Data Science Hackathon At The University of Texas, Dallas

The Colaberry learning team, in partnership with the data science club of the University of Texas, Dallas, organized a data science hackathon on October 28th, 2017. It was a whirlwind test of https://refactored.ai, the data science learning platform we are building in our labs, in an uncontrolled environment. Here …

 

Logistic Regression with Titanic - ODSC 2017

At the Open Data Science Conference in Boston, held on May 3rd, 2017, Colaberry presented an introductory workshop on Data Science with Python. This involved teaching Python and the libraries needed for Data Science, followed by Logistic Regression with Titanic. This was done with the help of our online self-learning platform: http://refactored.ai. You can still sign …

 

Jensen’s Inequality That Guarantees Convergence of the EM Algorithm

Jensen’s Inequality that Guarantees Convergence of the EM Algorithm. Enjoy Colaberry blogs which cover this and many other cutting-edge topics. Jensen’s Inequality states that given g, a strictly convex function, and X, a random variable, then E[g(X)] ≥ g(E[X]). Here we shall consider a scenario of a strictly convex function that maps a uniform random variable and visualize …
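The inequality can be checked numerically. The sketch below is a minimal illustration and is not taken from the original post: it assumes g(x) = x² as the strictly convex function and X uniform on [0, 1), and the function name jensen_check is hypothetical. It compares the sample average of g(X) with g applied to the sample average of X.

```python
import random


def g(x):
    """Strictly convex function chosen for illustration (an assumption)."""
    return x * x


def jensen_check(num_samples=100_000, seed=0):
    """Return (mean of g(X), g of mean of X) for X ~ Uniform[0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    xs = [rng.random() for _ in range(num_samples)]
    mean_x = sum(xs) / num_samples
    mean_gx = sum(g(x) for x in xs) / num_samples
    return mean_gx, g(mean_x)


mean_gx, g_mean = jensen_check()
print(f"E[g(X)] ~ {mean_gx:.4f}, g(E[X]) ~ {g_mean:.4f}")
```

For this choice of g, the gap between the two quantities is the variance of X, so the first number should come out near 1/3 and the second near 1/4.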

 

Building An Efficient Pipeline - AI Summit 2017, St. Louis

At the StampedeCon AI Summit in St. Louis, Colaberry Consulting presented how to take a complex idea in the AI domain, apply ML algorithms to it, and deploy it in production using the refactored.ai platform. More details, including code examples, are available on the platform. This, as we see, is a common area of …

 

Why Bayesian Formulations Are Better Than Maximum Likelihood Estimates?

Find out why Bayesian formulations are a better option than maximum likelihood estimates for data analysis. Read our blog post for valuable insights and examples. Maximum Likelihood: Maximum Likelihood Estimation (MLE) suffers from overfitting when the number of samples is small. Suppose a coin is tossed 5 times and you have to estimate the probability …
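The small-sample problem can be sketched with a coin example. The excerpt is truncated, so the details below are assumptions rather than the post's own numbers: the outcome of 5 heads in 5 tosses is hypothetical, the Bayesian estimate uses a uniform Beta(1, 1) prior, and the function names are illustrative.

```python
def mle_estimate(heads, tosses):
    """Maximum likelihood estimate of P(heads): the observed frequency."""
    return heads / tosses


def posterior_mean(heads, tosses, alpha=1.0, beta=1.0):
    """Posterior mean of P(heads) under a Beta(alpha, beta) prior.

    The Beta(1, 1) default is a uniform prior, assumed for illustration.
    """
    return (heads + alpha) / (tosses + alpha + beta)


heads, tosses = 5, 5  # hypothetical outcome: every toss lands heads
print("MLE:", mle_estimate(heads, tosses))
print("Posterior mean:", round(posterior_mean(heads, tosses), 3))
```

With these assumed counts the MLE claims heads is certain, while the posterior mean stays below 1 because the prior keeps some probability on tails.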

 

Data Science Predictive Analytics Pipeline

In this blog post, we discuss the data science predictive analytics pipeline and point to where you can learn more on the refactored.ai platform. Every organization or individual Data Scientist performs a set of tasks in order to run predictions on input datasets. At Colaberry we have extensive experience working with data science and predictive …

 

Data Science Explained

Data Science, also known as data-driven science, is an interdisciplinary field of scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining. Data Science is a “concept to unify statistics, data analysis, and their related methods” to “understand and analyze …