FAQs



Data Analytics is about understanding your data and using that knowledge to drive actions. It reveals the trends and outliers within the data which might otherwise be difficult to note. It is a scientific way to convert raw data into information that helps guide difficult decisions. A number of statistical tools and software packages are available to perform data analytics. The nature of the data and the problem that needs to be solved using the insights from the data guides the choice of statistical tools and techniques. Domain knowledge and expertise are also very important to interpret and apply the results obtained from analytics. Lastly, in our experience, the best data analysts are those who have the ability to dig into the data but can also layer common sense and domain knowledge into their recommendations.

Businesses are using analytics to make more informed decisions and to plan ahead. It helps businesses uncover opportunities that are visible only through an analytical lens. Analytics helps companies decipher trends, patterns and relationships within data to explain, predict and react to market phenomena. It helps answer the following questions:
What is happening and what will happen?
Why is it happening?
What is the best strategy to address it?

Collecting large amounts of data about multiple business functions from internal and external sources is simple and easy using today’s advanced technologies. The real challenge begins when companies struggle to infer useful insights from this data to plan for the future. Using analytics, businesses can improve their processes, increase profitability, reduce operating expenses and sustain a competitive edge over the long run.

Building an analytics function requires long-term commitment and extensive resources. An organization has the option to seek analytical help from in-house resources, from outside analytical vendors, or from both in parallel. Any organization needs to spend considerable time and money to recruit and train in-house analytical staff. At times it may not possess the know-how required to recruit such specialized staff or to decide on the technologies best suited for carrying out the analysis. In these circumstances organizations rely on analytical vendors like IQR Consulting. Such vendors can work closely with the management team to help the organization adopt analytics. The organization has to trust and cooperate with the vendors while sharing its data and researching it to make the analytics engagement a success. Organizations can follow another model in which they build an internal team to manage their relationships with an external analytical vendor. Many analytically mature companies resort to this to supplement their internal efforts.

A typical analytics project or engagement is generally divided into the following four stages:
Stage 1 - ‘Research’, where the analyst helps to identify and understand the problems and issues that the business is facing or may encounter in the future. At this step there is significant interaction between the management team and the analysts.
Stage 2 - ‘Plan’, where the analyst helps decide what type of data is required, the sources from which the data is to be procured, how the data needs to be prepared for use and what methods are to be used for analysis.
Stage 3 - ‘Execute’, where the analyst explores and analyzes the data from various angles. The analysis paves the way to interesting results that are shared with management. Based on these results, strategies are formulated to tackle the problems identified in Stage 1.
Stage 4 - ‘Evaluate’, where the analyst measures the results of the strategies formulated and executed. This stage helps the business learn and revise future strategies and processes.

A strategy built using analytics is a set of simple, implementable recommendations that efficiently uses the information drawn from the data. An effective and efficient strategy suggests the best use of the available business resources. It helps find solutions for some of the biggest problems faced by the company. The process followed to formulate the strategy might be complex, but the final result is actionable and useful for management.

Analytics is not a one-time or special event; it is a continuous process. Businesses should not take their attention off analytics and should plan to adopt it as a regular business function. The business has to make collecting, cleaning and analyzing data routine, and a support role for functions that do not have the capability to do so. Most businesses look towards analytics when they face a problem and think the solution lies within their data. Once businesses start appreciating the potential analytics has to solve problems, they begin to use it to make all kinds of strategic and regular business decisions.

The resources and time required for an analytics project depend on a number of factors, the major ones being the scope and scale of the project, the readiness and availability of the required data, understanding of the analysis tools, the skills and knowledge of the analytical team and, most importantly, acceptance and approval from the management team to carry out the analytics project. The analytics team generally defines a project timeline based on the factors listed above. Interim findings and analysis difficulties might alter the goals and objectives of the project. This might require the team to re-work the time and resources required for completing the project. Deemsoft would be happy to provide you an estimate of the resources required to complete the analytics project and goals that you have in mind for your organization. Please contact us with details of your project.

Data is the most important resource for any analytics project, hence the business should make sure that it captures its business and customer data in a structured manner. This will ensure that the company has all the relevant data in the most usable form and can help the project move along quickly.

Delays in analytics projects generally take place when the data provided to the analytical team is not usable in its current form. The data needs to be structured, cleaned and mined to make it usable. This step can take anywhere from hours to days to months depending upon the size and form of the data.

Deemsoft would be happy to talk to you more about the state of your data and more specifically how 'ready' it is for analysis projects. Please contact us with details of your project.

For analytical needs, an organization can decide to use data analysis software like SAS or SPSS, seek help from custom consulting companies like Deemsoft, or even build data analytics capabilities in-house. Today companies are even using a combination of the above.
Each of the above options comes with its own pros and cons. An organization has to find which option suits its analytical needs best depending upon the nature of its business and existing resources. The costs associated with these options are rarely the same for any two organizations. Deemsoft provides free consultation to evaluate the solutions needed.

There are two types of models, predictive and descriptive. Descriptive models are good at explaining what has happened and what is happening. Predictive models explain what will happen and why. These models are increasingly being utilized to solve problems across finance, marketing, human resources, operations and other business functions. At IQR, we have seen these models being used in financial services, casinos, airlines, retail, telecom, insurance, healthcare and even manufacturing industries.
Increased competition has expanded the scope, the need and the use of predictive modeling. Businesses need to be more proactive than before to build or sustain a competitive advantage. They need to get answers for tomorrow even before it arrives.
Predictive models are created using past and present data to foresee what will happen in the future. These models are built to find answers to some of the most challenging business questions. They help manage portfolio returns, retain customers, undertake cross-selling activities, organize direct marketing campaigns, assess employee attrition and absenteeism, manage risks and formulate underwriting criteria, predict inactive customer accounts, cope with customer service requests, plan inventory and much more.
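To make the idea of a predictive model concrete, here is a minimal sketch of one such application, customer churn prediction, written in Python with scikit-learn. It assumes a historical customer table with a binary churn flag; the file name and feature columns are hypothetical placeholders, and a production model would involve far more data preparation and validation.

```python
# Minimal, illustrative churn-prediction sketch (not a production model).
# Assumes a historical CSV of customers with a binary "churned" column;
# the file name and feature columns below are hypothetical placeholders.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("customer_history.csv")  # hypothetical input file

features = ["tenure_months", "monthly_spend", "support_calls"]  # assumed columns
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["churned"], test_size=0.3, random_state=42
)

# Learn from past behavior to predict future churn.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score held-out customers to estimate how well the model foresees churn.
probabilities = model.predict_proba(X_test)[:, 1]
print("Validation AUC:", roc_auc_score(y_test, probabilities))
```

The same pattern, training on past outcomes and scoring current records, underlies most of the predictive applications listed above.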

“Big data” is an all-inclusive term used to describe vast amounts of information. In contrast to traditional structured data, which is typically stored in a relational database, big data varies in terms of volume, velocity, and variety. Big data is characteristically generated in large volumes, on the order of terabytes or even exabytes (an exabyte is 1 followed by 18 zeros, or 1 million terabytes) per individual data set. Big data is also generated with high velocity: it is collected at frequent intervals, which makes it difficult to analyze (though analyzing it rapidly makes it more valuable). Or, in simple words, we can say “Big data includes data sets whose size is beyond the ability of traditional software tools to capture, manage, and process the data in a reasonable time.”

This question cannot be answered in absolute terms. Based on the infrastructure available on the market, the lower threshold is at about 1 to 3 terabytes. But using big data technologies can be sensible for smaller databases as well, for example if complex mathematical or statistical analyses are run against a database. Netezza, for instance, offers about 200 built-in functions, and languages like Revolution R or Python can be used in such cases.

Contrary to what some people believe, intuition is as important as ever. When looking at massive, unprecedented datasets, you need someplace to start. In Too Big to Ignore, I argue that intuition is more important than ever precisely because there’s so much data now. We are entering an era in which more and more things can be tested. Big data has not replaced intuition — at least not yet; the latter merely complements the former. The relationship between the two is a continuum, not a binary.

Roughly 80% of the information generated today is of an unstructured variety. Small data is still very important — e.g., lists of customers, sales, employees and the like. Think Excel spreadsheets and database tables. However, tweets, blog posts, Facebook likes, YouTube videos, pictures and other forms of unstructured data have become too big to ignore. Again, big data here serves as a complement to — not a substitute for — small data. When used right, big data can reduce uncertainty, not eliminate it. We can know more about previously unknowable things. We can solve previously vexing problems. And finally, there’s the Holy Grail: Big data is helping organizations make better predictions and better business decisions.

Not exactly. Though there is a lot of buzz around the topic, big data has been around a long time. Think back to when you first heard of scientific researchers using supercomputers to analyze massive amounts of data. The difference now is that big data is accessible to regular BI users and is applicable to the enterprise. The reason it is gaining traction is because there are more public use cases about companies getting real value from big data (like Walmart analyzing real-time social media data for trends, then using that information to guide online ad purchases). Though big data adoption is limited right now, IDC determined that the big data technology and services market was worth $3.2B USD in 2010 and is going to skyrocket to $16.9B by 2015.

Big data is often boiled down to a few varieties, including social data, machine data, and transactional data. With 230 million tweets posted on Twitter per day, 2.7 billion Likes and comments added to Facebook every day, and 60 hours of video uploaded to YouTube every minute (this is what we mean by velocity of data), social media data is providing companies with remarkable insights into consumer behavior and sentiment that can be integrated with CRM data for analysis. Machine data consists of information generated from industrial equipment, real-time data from sensors that track parts and monitor machinery (often also called the Internet of Things), and even web logs that track user behavior online. Major retailers like Amazon.com, which posted $10B in sales in Q3 2011, and restaurants like US pizza chain Domino’s, which serves over 1 million customers per day, are generating petabytes of transactional big data. The thing to note is that big data can resemble traditional structured data or unstructured, high-frequency information.

Eventually the big data hype will wear off, but studies show that big data adoption will continue to grow. With a projected $16.9B market by 2015 (Wikibon goes even further to say $50B by 2017), it is clear that big data is here to stay. However, the big data talent pool is lagging behind and will need to catch up to the pace of the market. McKinsey & Company estimated in May 2011 that by 2018, the US alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions. The emergence of big data analytics has permanently altered many businesses’ way of looking at data. Big data can take companies down a long road of staff, technology, and data storage augmentation, but the payoff – rapid insight into never-before-examined data – can be huge. As more use cases come to light over the coming years and technologies mature, big data will undoubtedly reach critical mass and will no longer be labeled a trend. Soon it will simply be another mechanism in the BI ecosystem.

From cloud companies like Amazon to healthcare companies to financial firms, it seems as if everyone is developing a strategy to use big data. For example, every mobile phone user has a monthly bill which catalogs every call and every text; processing the sheer volume of that data can be challenging. Software logs, remote sensing technologies and information-sensing mobile devices all pose a challenge in terms of the volumes of data created. The size of big data is relative to the size of the enterprise. For some, it may be hundreds of gigabytes; for others, it may take tens or hundreds of terabytes before data size becomes a significant consideration.

In my opinion, it is absolutely essential for organizations to embrace interactive data visualization tools. Blame or thank big data for that; either way, these tools are amazing. They are helping employees make sense of the never-ending stream of data hitting them faster than ever. Our brains respond much better to visuals than to rows on a spreadsheet. Companies like Amazon, Apple, Facebook, Google, Twitter, Netflix and many others understand the cardinal need to visualize data. And this goes way beyond Excel charts, graphs or even pivot tables. Companies like Tableau Software have allowed non-technical users to create very interactive and imaginative ways to visually represent information.

The data scientist is one of the hottest jobs in the world right now. In a recent report, McKinsey estimated that the U.S. will soon face a shortage of approximately 175,000 data scientists. Demand far exceeds supply, especially given the hype around big data. However, to become a data scientist one does not necessarily follow a linear path. There are many myths surrounding data scientists. True data scientists possess a wide variety of skills. Most come from backgrounds in statistics, data modeling, computer science and general business. Above all, however, they are a curious lot. They are never really satisfied. They enjoy looking at data and running experiments.

It’s an interesting point, and I discuss it in Chapter 4 of Too Big to Ignore. If we look at the relational databases that organizations have historically used to store and retrieve enterprise information, then you are absolutely right. However, new tools like MapReduce, Hadoop, NoSQL, NewSQL, Amazon Web Services (AWS) and others allow organizations to store much larger data sets. The old boss is not the same as the new boss.

A few relatively small organizations have taken advantage of big data. Quantcast is one of them. There’s no shortage of myths around big data, and one of the most pernicious is that an organization needs thousands of employees and billions in revenue to take advantage of it. That is simply not true. I don’t know whether my electrician or my barber will embrace big data in the near future. However, we are living in an era of ubiquitous and democratized technology.

It’s already happening. Big data is affecting our lives in more ways than we can possibly fathom. The recent NSA Prism scandal shed light on the fact that governments are tracking what we’re doing. Companies like Amazon, Apple, Facebook, Google, Twitter and others would not be nearly as effective without big data. As you know, most people don’t work in data centers. Rather, it’s better for people to know about the companies whose services they use. Are those companies using big data? These days, the answer is probably yes. By extension, then, big data is affecting you whether you know it or not. In addition, as more and more companies embrace big data, there will be major disruption in the workforce.

Another reason big data is starting to go mainstream is the fact the tools to analyze it are becoming more accessible. For decades, arcplan partners Teradata (NYSE: TDC), IBM (NYSE: IBM), and Oracle (NasdaqGS: ORCL) have provided thousands of companies with terabyte scale data warehouses, but there is a new trend of big data being stored across multiple servers that can handle unstructured data and scale easily. This is due to the increasing use of open source technologies like Hadoop, a framework for distributing data processing across multiple nodes, which allows for fast data loading and real-time analytic capabilities. In effect, Hadoop allows the analysis to occur where the data resides, but it does require specific skills and is not an easy technology to adopt. Analytic platforms like arcplan, which connects to Teradata and SAP HANA, SAP’s (NYSE: SAP) big data appliance, allow data analysis and visualization on big data sets. So in order to make use of big data, companies may need to implement new technologies, but some traditional BI solutions can make the move with you. Big data is simply a new data challenge that requires leveraging existing systems in a different way.

The Apache Hadoop software library allows for the distributed processing of large data sets across clusters of computers using a simple programming model. The software library is designed to scale from single servers to thousands of machines, each server offering local computation and storage. Instead of relying on hardware to deliver high availability, the library itself handles failures at the application layer. As a result, the impact of failures is minimized by delivering a highly available service on top of a cluster of computers. For more info, see the Hadoop FAQ. Put another way, Hadoop is a distributed computing platform written in Java. It incorporates features similar to those of the Google File System and of MapReduce.
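As an illustration of that programming model, here is a single-process Python sketch of MapReduce-style word counting. It is a local simulation only, meant to show the map, shuffle, and reduce phases; on a real Hadoop cluster these phases run in parallel across many machines.

```python
# Single-process illustration of the MapReduce programming model that Hadoop
# distributes across a cluster. This is a teaching sketch, not Hadoop code.
from collections import defaultdict

def map_phase(line):
    """Emit (word, 1) pairs for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    """Combine all counts emitted for the same word."""
    return word, sum(counts)

def mapreduce(lines):
    # "Shuffle" step: group intermediate pairs by key, as Hadoop does
    # between the map and reduce phases.
    groups = defaultdict(list)
    for line in lines:
        for word, count in map_phase(line):
            groups[word].append(count)
    return dict(reduce_phase(word, counts) for word, counts in groups.items())

if __name__ == "__main__":
    sample = ["big data is big", "hadoop processes big data"]
    print(mapreduce(sample))
    # {'big': 3, 'data': 2, 'is': 1, 'hadoop': 1, 'processes': 1}
```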

Hadoop is one of the projects of the Apache Software Foundation. The main Hadoop project is contributed to by a global network of developers. Sub-projects of Hadoop are supported by the world’s largest Web companies, including Facebook and Yahoo.

Hadoop’s popularity is partly due to the fact that it is used by some of the world’s largest Internet businesses to analyze unstructured data. Hadoop enables distributed applications to handle data volumes in the order of thousands of exabytes.

Hadoop, as a scalable system for parallel data processing, is useful for analyzing large data sets. Examples are search algorithms, market risk analysis, data mining on online retail data, and analytics on user behavior data. Hadoop’s scalability makes it attractive to businesses because of the exponentially increasing nature of the data they handle. Another core strength of Hadoop is that it can handle structured as well as unstructured data, from a variable number of sources.

To many enterprises, the Hadoop framework is attractive because it gives them the power to analyze their data, regardless of volume. Not all enterprises, however, have the expertise to drive that analysis such that it delivers business value. Scaling up and optimizing Hadoop computing clusters involves custom coding, which can mean a steep learning curve for data analytics developers. Hadoop was not originally designed with the security functionalities typically required for sensitive enterprise data. Other potential problem areas for enterprise adoption of Hadoop include integration with existing databases and applications, and the absence of industry-wide best practices.

Hadoop originally derives from Google’s implementation of a programming model called MapReduce. Google’s MapReduce framework could break down a program into many parallel computations, and run them on very large data sets, across a large number of computing nodes. An example use for such a framework is search algorithms running on Web data. Hadoop, initially associated only with web indexing, evolved rapidly to become a leading platform for analyzing big data. Cloudera, an enterprise software company, began providing Hadoop-based software and services in 2008. In 2012, GoGrid, a cloud infrastructure company, partnered with Cloudera to accelerate the adoption of Hadoop-based business applications. Also in 2012, Dataguise, a data security company, launched a data protection and risk assessment tool for Hadoop.

The Hadoop JDBC driver can be used to pull data out of Hadoop; the DataDirect JDBC driver can then be used to bulk load the data into Oracle, DB2, SQL Server, Sybase, and other relational databases.
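Below is a hedged Python sketch of that pull-and-bulk-load pattern using the jaydebeapi package, which lets Python drive JDBC drivers. The Hive driver class shown is the standard one, but every host, port, credential, jar path, table and column name, and the target driver class are placeholders, not the specific DataDirect configuration.

```python
# Hedged sketch: read rows out of Hadoop over JDBC and bulk-insert them into a
# relational target. All connection details below are placeholder assumptions.
import jaydebeapi

# Source: Hive on Hadoop via its JDBC driver (connection details are assumptions).
source = jaydebeapi.connect(
    "org.apache.hive.jdbc.HiveDriver",
    "jdbc:hive2://hadoop-host:10000/default",
    ["hive_user", "hive_password"],
    "/path/to/hive-jdbc-standalone.jar",
)

# Target: any relational database reachable over JDBC (placeholder driver/URL).
target = jaydebeapi.connect(
    "com.example.jdbc.Driver",
    "jdbc:example://db-host:1521/sales",
    ["db_user", "db_password"],
    "/path/to/target-jdbc-driver.jar",
)

read_cur = source.cursor()
write_cur = target.cursor()

read_cur.execute("SELECT order_id, customer_id, amount FROM orders")
rows = read_cur.fetchall()

# Bulk load: executemany sends the whole batch of prepared inserts at once.
write_cur.executemany(
    "INSERT INTO orders_copy (order_id, customer_id, amount) VALUES (?, ?, ?)",
    rows,
)
target.commit()  # needed if the target connection is not in autocommit mode

read_cur.close()
write_cur.close()
source.close()
target.close()
```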

The load operation is actually updating the index while you’re loading – the key is to make sure you’re not indexing while loading as it causes too many collisions and slows the whole process down.

Add the words “information security” (or “cybersecurity” if you like) before the term “data sets” in the definition above. Security and IT operations tools spit out an avalanche of data like logs, events, packets, flow data, asset data, configuration data, and an assortment of other things on a daily basis. Security professionals need to be able to access and analyze this data in real time in order to mitigate risk, detect incidents, and respond to breaches. These tasks have come to the point where they are “difficult to process using on-hand data management tools or traditional (security) data processing applications.”

First, security analysis is the examination of a multitude of phenomena for the purpose of detecting and/or responding to security incidents capable of impacting the confidentiality, integrity, or availability of IT assets. I would then define a security analytic as a deduction based upon the results of interactions of multiple simultaneous security phenomena. The thing that big data security analytics technologies allow us to do is capture more data and perform multi-variable security analytics. In the past we relied on simple security analytics to help us trigger a response. For example: “Trigger a security alarm when someone has 3 failed log-in attempts on a critical system.” Effective, but too simple, and it generated way too many false positives. With big data security analytics, we can generate security analytics that get much deeper: “Trigger a security alert when someone has 3 failed log-in attempts on a critical system when this activity is executed after hours from an employee device, the employee’s job responsibility is such that he or she should not be logging into this system, and the physical security system indicates that the employee is not in the building.” This is the kind of stuff that companies like Click Security, Lancope, and Solera Networks are working on.
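A rough Python sketch of such a multi-variable analytic is shown below. The event fields, the HR directory, and the badge-system lookup are hypothetical stand-ins for the data sources a real solution would correlate.

```python
# Rough sketch of the multi-variable security analytic described above.
# The event fields, HR directory, and badge-system lookup are hypothetical.
from datetime import datetime, time

BUSINESS_HOURS = (time(8, 0), time(18, 0))

def should_alert(event, hr_directory, badge_presence):
    """event example: {"user": ..., "host": ..., "failed_logins": int, "timestamp": datetime}"""
    after_hours = not (BUSINESS_HOURS[0] <= event["timestamp"].time() <= BUSINESS_HOURS[1])
    allowed_hosts = hr_directory.get(event["user"], {}).get("allowed_hosts", [])
    outside_job_role = event["host"] not in allowed_hosts
    not_in_building = not badge_presence.get(event["user"], False)

    # The simple analytic would stop at the first condition; the extra
    # context is what cuts down the false positives.
    return (
        event["failed_logins"] >= 3
        and after_hours
        and outside_job_role
        and not_in_building
    )

if __name__ == "__main__":
    event = {"user": "jdoe", "host": "fin-db-01", "failed_logins": 3,
             "timestamp": datetime(2013, 6, 14, 23, 30)}
    hr_directory = {"jdoe": {"allowed_hosts": ["hr-app-02"]}}
    badge_presence = {"jdoe": False}
    print(should_alert(event, hr_directory, badge_presence))  # True
```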

No. Hadoop technologies are certainly built into some big data security analytics solutions from vendors like IBM and RSA, but there is no requirement for Hadoop per se. Lots of vendors have developed their own data repositories (in lieu of Hadoop) that collect, store, and analyze security data. In the future, it is likely that Hadoop and other big data technologies will find their way into big data security analytics solutions but there are plenty of leading big data security analytics solutions that don’t use or integrate with Hadoop at this time.

This is certainly one of the primary use cases, but there are others as well. Many big data security analytics solutions are built using “stream processing” to accommodate the high I/O rate needed to process massive amounts of security data. In simple terms, stream processing distributes the processing load over a number of distributed nodes. Each node can provide local security analytics, and the nodes combine to form a computing grid for more global security data analysis. Big data security analytics built using this type of stream processing and grid architecture are designed for instant event detection and forensics. ESG calls this model “real-time big data security analytics solutions.” ESG calls big data security analytics designed for historical use “asymmetric big data security analytics solutions.”
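The following toy Python sketch illustrates the stream-processing layout described above: events are partitioned across several "nodes", each node keeps a local tally of failed logins per user, and the local results are merged into a global view. It runs in a single process purely for illustration; real solutions distribute the nodes and work on live streams.

```python
# Toy illustration of partitioned stream processing with local analytics that
# are merged into a global view. Everything runs in one process here.
from collections import Counter

NUM_NODES = 4

def route(event):
    """Choose the node responsible for this event by hashing the user name."""
    return hash(event["user"]) % NUM_NODES

def process_stream(events):
    local_counts = [Counter() for _ in range(NUM_NODES)]  # per-node analytics
    for event in events:
        if event["action"] == "failed_login":
            local_counts[route(event)][event["user"]] += 1

    # Combine the per-node views into a single, global security analytic.
    global_counts = Counter()
    for node_counts in local_counts:
        global_counts.update(node_counts)
    return global_counts

if __name__ == "__main__":
    stream = [
        {"user": "alice", "action": "failed_login"},
        {"user": "bob", "action": "login"},
        {"user": "alice", "action": "failed_login"},
    ]
    print(process_stream(stream))  # Counter({'alice': 2})
```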

Yes, those are the types of organizations on the leading edge, but I would argue that all medium to large organizations need this type of security intelligence. Big companies will likely buy products and solutions, while smaller companies will reach out to service providers like Arbor Networks (PacketLoop), Dell/SecureWorks, or the new SAIC spin-out Leidos. The best products and services will bake in intelligent algorithms, intuitive visualization, and process automation.

My suggestion is to download open source tools like BigSnarf, PacketPig, or sqrrl. This isn’t an exhaustive list but I’ve hit the major areas. Hopefully, this will help security professionals move beyond the hype and start to understand how big data security analytics can deliver real value.

It’s two things: big data and the kind of analytics users want to do with big data. Let’s start with big data, then come back to analytics. Data isn’t big until it breaks 10 TB. So that’s the low end of big data. And some user organizations have cached away hundreds of terabytes, just for analytics. The size of big data is relative; hundreds of terabytes isn’t new, but hundreds of terabytes just for analytics is, at least for most user organizations.

No, there’s more to it than that. Size aside, there are other ways to define big data. In particular, big data tends to be diverse, and it’s the diversity that drives up the data volume. For example, analytic methods that are on the rise need to correlate data points drawn from many sources, both in the enterprise and outside it. Furthermore, one of the new things about analytics is that it’s NOT just based on structured data, but on unstructured data (like human language text) and semi-structured data (like XML files, RSS feeds), and data derived from audio and video. Again, the diversity of data types drives up data volume. Finally, big data can be defined by its velocity or speed. This may also be defined by the frequency of data generation. For example, think of the stream of data coming off of any kind of sensor, say thermometers sensing temperature, microphones listening for movement in a secure area, or video cameras scanning for a specific face in a crowd. With sensor data flying at you relentlessly in real time, data volumes get big in a hurry. Even more challenging, the analytics that go with streaming data have to make sense of the data and possibly take action—all in real time. Hence, big data is more than large datasets. It’s also about diverse data sources or data types (and these may be arriving at various speeds), plus the challenges of analyzing data in these demanding circumstances.

The kind of analytics applied to big data is often called “advanced analytics.” A better term would be “discovery analytics” because that’s what users are trying to accomplish. In other words, with big data analytics, the user is typically a business analyst who is trying to discover new business facts that no one in the enterprise knew before. To do that, you need large volumes of data that has a lot of details. And this is usually data that the enterprise has not tapped for analytics. For example, in the middle of the recent economic recession, companies were constantly being hit by new forms of customer churn. To discover the root cause of the newest form of churn, a business analyst grabs several terabytes of detailed data drawn from operational applications to get a view of recent customer behaviors. He may mix that data with historic data from a data warehouse. Dozens of queries later, he’s discovered a new churn behavior in a subset of the customer base. With any luck, he’ll turn that information into an analytic model, with which the company can track and predict the new form of churn.

Discovery analytics against big data can be enabled by different types of analytic tools, including those based on SQL queries, data mining, statistical analysis, fact clustering, data visualization, natural language processing, text analytics, artificial intelligence, and so on. It’s quite an arsenal of tool types, and savvy users get to know their analytic requirements first before deciding which tool type is appropriate to their needs.

An early extraction of survey data shows that only 30% of users responding to the survey are concerned about the technical challenges of collecting and managing big data. The vast majority, namely 70% of the users responding to the survey, say that big data is definitely an opportunity. That’s because through analysis the user organization can discover new facts about its customers, markets, partners, costs, and operations, then use that information for business advantage.


     Deemsoft is a leader in the field of SEO, SMO, SEM, PPC, website design and software development. We have years of experience and dedicated professionals who are eager to serve our customers. Quality and customer success are our top priorities.

     Deemsoft is capable of handling multiple projects, and we will start working on yours as soon as we get the contract approval.

     Deemsoft will honor its commitments; the timeline varies depending on the project. A typical website design will be completed in 10 working days. SEO will take around 3 months, and SEM/PPC can be started as soon as the website is completed.

     Customer satisfaction is our priority; however, in the event that we fail to meet your requirements, we will reimburse 100% of your money, as per the contract.

     Yes, we will manage it for you. PPC stands for "Pay Per Click", where you maintain a Google AdWords account. Even though it sounds simple, it is not. Since it is a bidding system, an inexperienced user may overpay. It takes strategy and experience to optimize campaigns and reach the target audience effectively.

     Search engine optimization (SEO) is the process of affecting the visibility of a website or a web page in a search engine's "natural" or un-paid ("organic") search results. In general, the earlier (or higher ranked on the search results page), and more frequently a site appears in the search results list, the more visitors it will receive from the search engine's users. SEO may target different kinds of search, including social media, blogs, image search, local search, video search, academic search, news search and industry-specific vertical search engines.

     Social media optimization (SMO) is the use of a number of social media outlets and communities to generate publicity to increase the awareness of a product, brand or event. Types of social media involved include RSS feeds, social news and bookmarking sites, social networking sites such as Facebook and Twitter, and video and blogging sites.

     A backlink is a link to your website from another website. This is important because search engines use these links to rank your website. It also matters what kind of websites link back to you. Search engines are continuously upgraded to filter out spam websites. These spam websites are created by some SEO companies just to provide backlinks, and once a search engine figures this out, rankings based on these links will go down. Our company will not engage in these kinds of activities; instead, we work hard to find blog, news and trade show websites relevant to your website and build links from there.

     Search engine crawlers consider various factors, including the website content, meta tags, backlinks, social media, blogs, etc. Here is the periodic table created by "SearchEngineLand".

     Penguin is the code name for a Google algorithm update that was first announced on April 24, 2012. The update is aimed at decreasing the search engine rankings of websites that violate Google's Webmaster Guidelines by using what are now declared black-hat SEO techniques to artificially increase the ranking of a webpage by manipulating the number of links pointing to the page. Such tactics are commonly described as link schemes.