Analytics in Shopping: Trolling Through the Trolleys

The retail sector is on the cusp of radical transformation, and analytics is the engine for change. Analytics has become the differentiator in a sector that was badly hit by the recession: if no intelligence is drawn from the data that is collected, there is no way that management can make informed decisions. Retail has hundreds of thousands of “data points”, more than any other sector, because of the nature of the business. These data points are stock keeping units (SKUs), each one accounted for on the overall merchandising plan, and being able to identify fast-moving items is the key to profitability.
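To make that concrete, here is a minimal sketch in Python, with hypothetical sales figures, of how a retailer might aggregate and rank SKUs to surface the fast movers:

```python
from collections import defaultdict

# Hypothetical weekly sales records: (sku, units_sold)
sales = [
    ("SKU-001", 120), ("SKU-002", 15), ("SKU-003", 480),
    ("SKU-001", 140), ("SKU-002", 10), ("SKU-003", 455),
]

# Aggregate units per SKU, then rank the fastest movers first
totals = defaultdict(int)
for sku, units in sales:
    totals[sku] += units

fast_movers = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
print(fast_movers)  # best sellers at the top of the merchandising plan
```

In practice this would be run per store and per season, but the principle, aggregate then rank, is the same.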

Comparative Street Names on Versions of Monopoly Boards

I suppose a store is like a Monopoly board, where Shrewsbury Road and Ailesbury Road are the most visited aisles and Crumlin gets less footfall. You need to put the highest-margin best sellers at the biggest traffic points. While retailers will have a good grasp of what is hot and what is not in their product lines, the devil really is in the detail. TRC Solutions, which delivers point-of-sale and business management software to retailers, says that understanding the sales performance of the middle 65-70 per cent of units is the hard part, and that’s where analytics makes the difference. It’s about informing decisions for an attrition programme to drum out non-performers or add new merchandise.

Another huge area of retail that benefits from better visibility is discounting: marking down the right items at the right time to ensure a timely turnover of stock. Without analytics around markdowns and sales, a fashion retailer, for example, risks ending up with out-of-season stock it cannot give away. According to Gavin Peacock, CEO of TRC Solutions, tier-one retailers appear to be constantly having sales because they have identified what isn’t moving and is likely to cause a blockage if it isn’t sold off to make way for the next range. The profitable retailers buy properly and smartly in cycles; they wouldn’t have a clue how to do it without analytics.

TRC, with over 20 years of experience, provides a range of products such as Tableau and Qlik, as well as more specialised retail applications. A natural consequence of stores generating more raw data, through electronic point-of-sale systems and accounting packages, is growing interest in analytics. The recession also played a massive part: from 2008 to 2012, all but the most disciplined retailers were in survival mode, under pressure to become more efficient. A number of banks called on their retail clients to use analytics as a condition for survival. Why, you might ask? Reduced footfall, tighter margins and the decline in average consumer spending encouraged retailers to drill down into data to find ways to do more with less. Now, with signs of recovery, analytics promises to take retail to a whole other level.

TRC Solutions is launching a product very soon that plugs into an always-on RFID (radio-frequency ID) device capable of reading 5,000 items 100 times a minute and streaming the data to the cloud to be analysed. It is amazing to think you can tell how many times an item is picked up before it is bought, how many times a garment is tried on before it is purchased, and how quickly staff get items back on the racks. We really do live in a world where flat, obsolete reporting has gone. The new way is about having trigger points for real intelligence built in across the entire enterprise.
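As a rough sketch of what analysing such a stream might look like (the tag IDs and zone names below are made up, and this is not TRC’s actual product):

```python
from collections import Counter

# Hypothetical RFID reads: (tag_id, antenna_zone). A read in the
# fitting-room zone counts as a try-on; a read at the till, a purchase.
reads = [
    ("TAG-7", "shelf"), ("TAG-7", "fitting_room"), ("TAG-7", "shelf"),
    ("TAG-9", "shelf"), ("TAG-7", "fitting_room"), ("TAG-9", "till"),
]

try_ons = Counter(tag for tag, zone in reads if zone == "fitting_room")
purchases = Counter(tag for tag, zone in reads if zone == "till")

print(try_ons["TAG-7"], purchases["TAG-9"])  # how often tried on vs bought
```

The real system would stream millions of reads to the cloud, but each "trigger point" reduces to counting events per tag per zone, much like this.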



Beyond Volume, Variety and Velocity is the Issue of Big Data Veracity

Big data incorporates all varieties of data, including structured data and unstructured data from e-mails, social media, text streams, and so on. Managing it requires companies to leverage both their structured and unstructured data. Big data enables organizations to store, manage, and manipulate vast amounts of disparate data at the right speed and at the right time. To gain the right insights, big data is typically broken down by three characteristics:

  • Volume: How much data
  • Velocity: How fast data is processed
  • Variety: The various types of data


Big data implies enormous volumes of data. It used to be that employees created data; now data is generated by machines, networks and human interaction on systems like social media, so the volume of data to be analyzed is massive. The sheer amount of information poured into billions of network interactions in a single day makes the ever-growing volume easy to appreciate. Further research on big data is focused on making this volume available for various kinds of analysis in enterprises and business organizations. For instance, billions of transaction records in a retail chain can be analyzed for buying trends or consumers’ buying frequency for selected products, or trillions of fuel bills can be analyzed to inform the next vehicle fuel policy.


Variety refers to the many sources and types of data, both structured and unstructured. We used to store data from sources like spreadsheets and databases; now data comes in the form of emails, photos, videos, monitoring devices, PDFs, audio and more. This variety of unstructured data creates problems for storing, mining and analyzing data. The widest possible variety of data types is one aspect of big data analysis that promises benefits for organizations of all sizes. Big data comprises any type of data: audio, video, graphics, spreadsheets, log files, 3D images, simple text, click streams, or just about anything else. When these multifarious types of data are analyzed together, they can provide a great range of insights. For instance, the torrent of text messages around a football match can be analyzed against the comparatively small number of actual spectators, which may indicate a need for the event’s managers and organizers to change their marketing and publicity tactics.


Big data velocity deals with the pace at which data flows in from sources like business processes, machines, networks and human interaction with things like social media sites and mobile devices. The flow of data is massive and continuous. This real-time data can help researchers and businesses make valuable decisions that provide strategic competitive advantage and ROI, if you are able to handle the velocity. Big data has much bigger implications in time-sensitive business processes than elsewhere: it is always a race to scrutinize the maximum volume of data for a business objective such as catching a fraudulent transaction or locating the exact reason why clients of a particular business are not coming back. Only a scrutinizing capability fast enough to handle large volumes of data in real time translates into business benefit. Faster processing of time-sensitive data gives you an edge in fault-finding and in uncovering the hidden loops in a process, and that is exactly one of the demands of big data that is becoming increasingly crucial.
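A toy illustration of velocity in Python: a sliding-window monitor that flags a suspicious burst of transactions in real time (the window size, threshold and timestamps are all made up for the sketch):

```python
from collections import deque

def make_rate_monitor(window_seconds=60, max_events=3):
    """Flag a burst: more than max_events arriving inside the window."""
    times = deque()
    def check(timestamp):
        times.append(timestamp)
        # Evict events that have fallen out of the sliding window
        while times and times[0] <= timestamp - window_seconds:
            times.popleft()
        return len(times) > max_events
    return check

check = make_rate_monitor(window_seconds=60, max_events=3)
flags = [check(t) for t in (0, 10, 20, 30, 200)]
print(flags)  # the fourth transaction trips the monitor
```

The point is that the decision is made as each event arrives, not after a nightly batch run; that is what "handling the velocity" means in practice.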


Big data veracity refers to the biases, noise and abnormality in data. Is the data that is being stored and mined meaningful to the problem being analyzed? Inderpal feels that veracity is the biggest challenge in data analysis, bigger than things like volume and velocity. In scoping out your big data strategy, you need to have your team and partners work to keep your data clean, with processes to keep ‘dirty data’ from accumulating in your systems. The accuracy or trustworthiness of information is one aspect that challenges the use of data in business analysis or trend analytics. Many business managers and top decision-makers remain sceptical about the accuracy and outcomes of analysis based on varied data sources; yet when the body of data grows big enough to contain contradictory trends and aspects, it can itself become a good basis for determining accuracy. As the volume grows in big data analysis, conclusions motivated by partial observation become futile, and exceptionally big data reserves, handled properly, can yield more accurate observations.
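A minimal sketch of the "keep dirty data out" idea: a type-and-range filter applied to incoming records before they accumulate (the field name and bounds are hypothetical):

```python
def clean_records(records, field, low, high):
    """Keep only records whose field is present, numeric and in range."""
    cleaned = []
    for rec in records:
        value = rec.get(field)
        if isinstance(value, (int, float)) and low <= value <= high:
            cleaned.append(rec)
    return cleaned

# Hypothetical readings containing noise: a missing value, a typo, an outlier
raw = [
    {"temp": 21.5}, {"temp": None}, {"temp": "n/a"},
    {"temp": 19.8}, {"temp": 999.0},
]
print(clean_records(raw, "temp", -40, 60))  # only the plausible readings
```

Real validation pipelines are far richer than this, but the principle is the same: reject or quarantine implausible values at the door rather than analysing them later.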

The Four V’s – Volume, Velocity, Variety, Veracity Infographic

Just in case you ever needed an infographic for the 4 V’s of big data, IBM has one for you:

4 V's of big data: volume, velocity, variety, veracity

Microsoft: Connecting Kopparberg to Growth



Kopparberg is a subsidiary of Kopparberg Brewery Sweden that operates in both Britain and Northern Ireland. It was launched in 2006 and has been a tremendous success, becoming Britain’s largest fruit cider brand in both the on-trade and the off-trade, with a staggering £120 million turnover per annum, helped greatly by Microsoft’s suite of solutions. The Kopparberg team has a unique structure for a business of its kind, with the majority of staff working remotely. The team spends most of its time visiting prospects and customers around Britain and Northern Ireland, so being able to link up with the main office to share information in real time is crucial to growth, and this was an area of the business that needed to be addressed. Through a Microsoft partner, “leaf”, a number of Microsoft solutions were implemented within Kopparberg that significantly improved performance: the team is 15 per cent more efficient, and lead times from prospect to client are significantly reduced.


Adopting “Lync” improved communication while removing the need for physical face-to-face meetings, which was very useful for the sales team when briefing new business and team members. Kopparberg needed streamlined communications and secure back-up to ensure continuity of sales and order processing. The implementation of “OneDrive” ensured that relevant information could be shared across the team instantly and securely. Management can set permissions on files and share them with select team members as necessary, further heightening security. The previous method, email, created extra administration and sometimes confusion.



Microsoft Dynamics CRM, which was deployed not too long ago, has made a positive impact. It has become an invaluable part of the business, as management can now collate information from the sales team on the road and upload it to a central point. Kopparberg now has the information to really focus its marketing and distribution efforts: it can identify specific venues in certain areas of the country that do not stock a particular product, thereby helping to pinpoint marketing and sales spend. Kopparberg has found the Microsoft process to be efficient; it was easy to sell into the business, as the benefits of integration were obvious from the outset. Thanks to the ease of use of “Lync”, “OneDrive” and Office 365, Kopparberg can rapidly train and engage new staff, as well as minimise the impact when existing staff leave the organisation. Kopparberg plans to add further improvements to its systems, including email tagging linked directly to the CRM, enabling it to see any pertinent customer or prospect trends. The company is acutely aware that it lives in a “Microsoft world” and chose to be there because it allows the team to be on the road, more productive and more successful.

Microsoft and the softly, softly approach? I don’t think so!


Microsoft simply wants to make business smarter, and it is on a mission to simplify BI and analytics by making them accessible from the heart of a business. The last 15 years have seen Microsoft extend its reach with enterprise applications, expanding out from its best-selling Office suite to encompass ERP and CRM. BI and analytics are also part of the mix, but we must be very clear about the difference: BI is about dealing with facts, perhaps merging multiple data sets and connecting dots across the business; analytics is more scientific, predicting the less obvious, and is usually aligned with innovation and more agile decision-making. What Microsoft has seen is a shift in which a core group of business people stand up and take ownership of data. They want to leverage it as an asset and bring technology to bear on it. Microsoft reckons that data cleansing, or as it likes to call it, “data wrangling”, takes up to 50 per cent of an organisation’s analytics project time, which is why it is setting about making people more productive with data.

The first move was to make BI more accessible. By leveraging the power and familiarity of Excel at the front end and the ubiquity of SQL Server at the back, Microsoft has been able to develop self-service tools and procedures that mask complex models and visualisation techniques; something you could only do with specialised tools 15 years ago is now a simple click in Excel. You can have three-dimensional models that live on your PC and don’t require a ton of server horsepower. Microsoft is fully aware that this is only one part of the intelligence jigsaw and has been working hard to slot in the big data piece: bigger projects that leverage open-source super-compute frameworks like Hadoop can be accessed through Azure, Microsoft’s cloud-based development platform.

“R”: Is It Really a Global Phenomenon?


Well, it was created in New Zealand, so that’s not a bad start. With “R”, every data analysis technique is at your fingertips: it can draw upon virtually every data manipulation, statistical model, and chart that the modern data scientist could ever need. “R” can create beautiful and unique data visualizations that go far beyond the traditional bar chart and line plot; from the simplicity of variables, vectors and data frames to the stunning infographics of multi-panel charts, 3-D surfaces and more, “R” has the lot. These custom charting capabilities of “R” are featured worldwide in many different domains, e.g. The New York Times and The Economist.


“R” is a masterly and proficient tool that gets better results faster. It does not rely on point-and-click menus; it is designed expressly for data analysis. Intermediate-level “R” programmers create data analyses faster than users of legacy statistical software, with the added bonus that they can mix and match models for the best results. It should also be noted that “R” scripts are easily automated, promoting both reproducible research and production deployments.



“R” is without doubt a global community: it has more than 2 million users and developers who voluntarily contribute their time and technical expertise to maintain, support and extend the “R” language and its environment, tools and infrastructure. At the centre of the “R” community is the “R” Core group of approximately twenty developers worldwide who maintain “R” and guide its evolution. The official public structure for the “R” community is provided by the “R” Foundation, a not-for-profit organization that ensures the financial stability of the “R” project and holds and administers the copyright of the “R” software and its documentation.




“Trickle, Trickle, Trickle” (“Simple”, “Fast”, “Cheap”), Part 1


Data analytics has trickled down from large corporates and is now readily available in the mainstream. No longer the sole preserve of large corporations, it is more accessible, more immediate and more affordable. When you think about it, computers were invented for analysing data, and “big data” has really been around for ages; it has just been rebranded. Organisations are swamped in data: a steady trickle from accounts and ERP packages swelled with the advent of email, e-commerce and the increasing use of CRM. Now, with new technologies able to scrape unstructured as well as structured data and “big data” entering the lexicon of business language, we are waking up to a deluge. The challenge is where to start turning it all to business advantage. Firms will find different reasons to surface data for analytics; it’s sector-specific and depends on where the company is coming from.


Take Accenture as an example: it is seeing a lot of focus on getting a better understanding of customers and customer behaviour. Companies are looking to leverage broad sets of data to get a holistic view that allows them to engage in a more personalised and targeted way. In manufacturing, Accenture is seeing a focus on operations, leveraging sensor-driven data to send out alerts on the imminent failure of a device; in financial institutions, the focus is on risk, identifying deviations in data that expose fraud sooner rather than later. The key challenge for companies is trying to corral their data and get it in order: identifying the sources of the data and whether they can trust it. Despite all the hype around big data, organisations in Ireland are still relatively immature and struggling with the fundamentals. Accenture estimates that anywhere between 20 and 70 per cent of a project is the data-cleansing piece, making the data ready for analytics. Long-established processes such as ETL (extract, transform, load) are still used, but what’s changed is the expectation of faster results. Making this possible are new technologies such as Hadoop that can crunch vast amounts of data quickly. If a bank wants to measure the impact of closing down a branch, for example, where you are not as concerned about the quality of the data but want a quick answer to a quick question, Hadoop will do the job: it will be able to tell you the systemic impact of closing it down.


R (Programming Language), A Quick Introduction:

“R” is a programming language and software environment for statistical computing and graphics. The “R” language is widely used among statisticians and data miners for developing statistical software and doing data analysis. It is also called “GNU S” and is a strong functional language, with a big emphasis on linear and non-linear modelling, classical statistical tests, time-series analysis, clustering and classification. “R” offers an open-source route to participation in the development of statistical methodology, and one of its great strengths lies in its ability to produce publication-quality plots, including mathematical symbols and formulae. Secondly, it is designed around a true computer language, so it allows users to add functionality by defining new functions.


Why use the “R” language?

Well, it is both flexible and powerful, and, really importantly, it is designed to operate the way that problems are thought about. “R” is not just a package, it is a language, and there is a difference: with a package you can perform a set number of tasks, often with some options that can be varied; a language allows you to specify the performance of new tasks. One of the goals of “R” is that the language should mirror the way that people think. Let’s take a simple example: suppose we think that weight is a function of height and girth. The “R” formula to express this is “weight ~ height + girth”, where the + is read not as addition but as “and”.

Another feature of the “R” language is that it is vector-oriented, meaning that objects are generally treated as a whole, as humans tend to think of the situation, rather than as a collection of individual numbers. Suppose that we want to change heights from inches to centimetres. In “R” the command would be “height.cm <- 2.54 * height.inches”, where height.inches is an object that contains some number, one or millions, of heights. “R” hides from the user that this is a series of multiplications and acts more like we think: whatever is in inches, multiply by 2.54 to get centimetres. Over the last decade “R” has become one of the most powerful and widely used statistical software environments; it has without doubt established itself as the most popular language for data science and an essential tool for finance and for analytics-driven companies such as Google, Facebook and LinkedIn. We might explore a little bit more about “R” in my next post, and in the meantime we might ask Larry!
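For comparison, the same whole-object conversion written in Python needs an explicit loop (here a list comprehension), whereas in R the vectorisation is built into the language itself:

```python
# Heights in inches converted to centimetres as a whole object,
# mirroring the R command  height.cm <- 2.54 * height.inches
height_inches = [60, 65, 70, 72]
height_cm = [2.54 * h for h in height_inches]
print([round(x, 2) for x in height_cm])
```

The contrast is the point: R lets you write the arithmetic once against the whole vector, which is closer to how the problem is stated in words.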



Fusion Tables and Heat Maps

This heat map is a graphical two-dimensional representation of census data from 2011, showing the distribution of counties and their boundaries across Ireland, with values represented by colours. This fairly simple heat map provides an immediate visual summary of information: I have used colour to communicate relationships between data values that would be much harder to understand if presented numerically. In the Irish population example, I had two tables: one contains county-by-county population figures, the other contains geographic information for each county and its borders. I uploaded them to Fusion Tables, where I merged the two tables. I was then able to visualize the merged table on a map and apply a style to it. Fusion Tables pulled the county names, population figures and border information and laid the facts onto a base map.
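The merge step can be sketched in Python (illustrative round-number populations and placeholder geometry, not the real census extract):

```python
# Two source tables keyed by county name, as uploaded to Fusion Tables
population = {"Dublin": 1270000, "Cork": 519000, "Leitrim": 32000}
boundaries = {"Dublin": "<KML polygon>", "Cork": "<KML polygon>",
              "Leitrim": "<KML polygon>"}

# Merge on the shared county key, keeping both attributes per county
merged = {
    county: {"population": pop, "boundary": boundaries[county]}
    for county, pop in population.items()
    if county in boundaries
}
print(sorted(merged))
```

Fusion Tables does exactly this join behind the scenes, then styles each merged boundary by its population value.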
Counties are coloured and categorized according to their population density. The graduated colour scheme allows for a quick and easy analysis of this data. It is evident from the map that the yellow-coloured counties (e.g. Dublin, Galway and Cork) are the most densely populated areas, with more than 250,000 people living in each. It is also apparent that the orange-coloured counties (e.g. Leitrim) are the least densely populated, with between 15,000 and 55,000 people living in these areas. It is also easy to see that the light green represents a concentration of between 55,000 and 87,000, running right up through the midlands and into Sligo.

By using the filter option on heat maps, you can isolate different values. Take total population, for instance: it can be selected and mapped in relation to Ireland. Heat maps can also be sorted in ascending or descending order. This is useful when trying to decipher quickly which areas have the lowest and highest values (e.g. figuring out which county has the most females), and it is excellent when searching for areas with specific values (say you want only counties with between 65,000 and 150,000 males displayed on the map). Such a map could also provide an interactive visualization aid for the distribution of the elderly population across Ireland.

From a conceptual point of view, heat maps can now cover a wide range of variables, not just population figures and the distribution of counties. Religion, nationality, education, social class, industry of employment, occupation, housing, cars per household, and health and disability all fall under this theme. You could even have a scenario where a number of different variables show the percentage of households with central heating powered by peat, mapped accordingly; you might elaborate on the type of fuel used, the type of sewage system, the tenure (owner-occupied or rented) or the actual type of housing unit (detached, one-bed apartment, etc.). The possibilities are endless, and one thing is for sure: it shows very clear patterns throughout the country for whatever geo-spatial mapping you require.

Coaching by Numbers: Is Data Analytics the Future Of Management?



Maths over Mourinho? Analytics over Ancelotti? Data analysis is now commonplace in both the sporting and business worlds, but does human decision-making still dominate in management? Let us investigate.



Most of us are familiar with the 2011 film Moneyball, where the killer weapon is data and trust is placed in computer-generated algorithms rather than common sense. The film spurred a great deal of speculation about the idea that technology may eventually replace sports managers. The underlying logic is twofold. First, computers are able to gather and process much more data than humans can, which enables them to better predict future performance. Second, unlike humans, computers are not biased by emotions or subjectivity, so their decisions are bound to be more rational than ours.


Data alone is trivial; it is only when combined with expertise, experience and knowledge that data can enhance our ability to make the right decisions. The point of data is to refine our intuition, but, at the same time, a great deal of intuition is needed to make sense of any data. Unless you know what to look for, the data will only show numbers. This is why experts are capable of making intuitive decisions that mirror data-driven decisions.



Humans are only partially rational, and because of this a purely rational approach to managing people does not work. This is why athletes need human coaches, who can tune into their emotional states and empathise with them. Of course, it may be possible to refine artificial intelligence to mimic human coaches in this task, but a fundamental difference between machines and humans will remain, namely that machines won’t care about the athletes; at best, they will be able to fake feelings, and those will still seem pretty unbelievable. Athletes are pre-wired to respond more empathically to humans than to computers. Having your coach watching you creates a strong process of psychological influence, called leadership, which machines will never manage to imitate. Thus, even if data does a good job of diagnosing problems, the intervention, acting on those problems, including making decisions and influencing athletes, is best left in the hands of humans.



The application of technology and data to sports management mirrors the wider realm of business. Consider the field of talent management, the area of human resources concerned with the selection, motivation, and retention of employees, especially at the top of the organisational hierarchy. Despite substantial technological developments in this area during the past decade, big data and computer-driven algorithms have yet to have a real impact on management practices. Sure, it is now easier, faster, and cheaper to find suitable employees for a job, to quantify their contribution to the company, and to make data-driven decisions regarding rewards, promotions and retention.



However, few organisations have adopted such tools widely, and those that have are not obviously more effective than their counterparts. Besides, there is a high price for the dataification of management practices. First, despite the objectivity of such practices, they are unlikely to be perceived as fair by the workforce. Second, making these practices transparent increases the probability that individuals game the system (just as hotel owners may fake their TripAdvisor ratings, or those of their competitors). Third, when transparency is avoided, ethical issues and anonymity concerns emerge. For instance, most companies would learn a great deal about their employees by mining their e-mail data, but who would want to work in a place like that?



In short, sports analytics, computer-driven algorithms and big data can certainly improve human decision-making in the field of competitive sports, but as long as the athletes are human, technology alone will not improve their performance. Data can help us make better predictions, but it will not make people more predictable than they already are. Finally, most coaches, clubs and managers have access to the same quality and quantity of data, yet significant differences between their performances remain because human decision-making still dominates the game. So, despite the appeal of sports analytics, it is fairly unlikely that José Mourinho or Carlo Ancelotti will be out of work soon.

The Big Data Revolution



How do we define big data?

While I fully expect every individual or company to add its own personal tweaks here or there, here is a one-sentence definition of big data to get the conversation started.

Big data is a collection of data from traditional and digital sources, inside and outside a company, that represents a source for ongoing discovery and analysis. Some people like to constrain big data to digital inputs like web behaviour and social network interactions; however, we cannot exclude traditional data derived from product transaction information, financial records and interaction channels such as call centres and point-of-sale. All of that is big data too, even though it may be dwarfed by the volume of digital data that is now growing at an exponential rate. In defining big data, it is very important to understand the mix of unstructured and multi-structured data that comprises the volume of information.


Unstructured data comes from information that is not organized or easily interpreted by traditional databases or data models, and typically it is text-heavy. Metadata, Twitter tweets and social media posts are good examples of unstructured data. Multi-structured data refers to a variety of data formats and types and can be derived from interactions between people and machines, such as web applications or social networks. A great example is web log data, which combines text and visual images with structured data like form or transactional information.


Every enterprise needs to fully understand big data: what it is to them, what it does for them, and what it means to them. The importance of big data is immense, and it can be harnessed through a multitude of different capabilities within an organisation; examples include data analysis tools, data warehouse testing, data asset management and comparative data analysis, all of which play a part in the big data phenomenon.


Information is arguably the most important fuel businesses run on. Intellectual property such as patents, institutional knowledge collected and stored by employees, sentiment gleaned from millions of social media posts, and consumer insights from the analysis of myriad online transactions are just a few examples of the information assets companies leverage today. Companies all over the world need to wake up to the reality that information governance is more important in the era of big data than it was beforehand. New big data tools leveraging technology such as Hadoop can process and analyse high volumes of data at reasonable cost, creating business intelligence that companies can use for competitive advantage. The beauty of Hadoop is that business users can keep everything, which is crucial because organisations do not want to archive or delete meaningful data.


This brings us nicely to “what is meaningful data”, or “how does a company know what data is meaningful”? Business intelligence (BI) programs can make sense of structured data, giving companies a good, or even exact, sense of what data is meaningful. But the percentage of a company’s information volume that consists of structured data is surprisingly small, so information hoarding in order to leverage big data tools may work in the structured data world, but it will not work in the broader information world that includes unstructured content, most of which is duplicate or unnecessary (think of all the junk and transitory email).

You might ask: how do we know what to delete? According to the experts, the current methods of information classification are inconsistent and do not scale well. The most-deleted content is email, and deletion is broadly time-based, meaning that companies delete email after a certain amount of time, which could ultimately lead to deleting valuable information. What is needed is a way to analyse information automatically, with some human review to judge its business value. While BI has gained mainstream traction in the structured data world, content analytics have not yet done so in the unstructured content world. What companies must understand is that big data, and the intelligence it can deliver, is worthy of embracing, and that effective information governance not only helps make business operations more efficient but, very importantly, mitigates risk. Most organizations are so busy just trying to manage structured information that they have not yet addressed unstructured content, much less given enough attention to the litigation risk associated with information. Now is the time.
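The weakness of purely time-based deletion can be sketched in a few lines of Python (the keyword list below stands in for a real business-value classifier, and the dates are made up):

```python
from datetime import datetime, timedelta

def should_delete(email, now, max_age_days=365,
                  keywords=("contract", "invoice")):
    """Time-based deletion with a crude value check on top: old mail is
    deleted unless its subject matches a business-value keyword."""
    too_old = now - email["received"] > timedelta(days=max_age_days)
    valuable = any(k in email["subject"].lower() for k in keywords)
    return too_old and not valuable

now = datetime(2015, 6, 1)
old_junk = {"received": datetime(2013, 1, 1), "subject": "Lunch plans"}
old_contract = {"received": datetime(2013, 1, 1), "subject": "Signed contract"}
print(should_delete(old_junk, now), should_delete(old_contract, now))
```

A naive age-only policy would delete both messages; even a crude value check keeps the contract, which is exactly the "automatic analysis plus human review" idea argued for above.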
