Is big data just "fortune-telling" after all? Here is what the tech experts say

Although BAT (Baidu, Alibaba, Tencent) holds the advantage in data volume, its data is limited in richness, and it lacks even the UGC-driven big data capabilities found in vertical areas. Small and medium-sized enterprises can make the most of their deep roots in vertical fields and increase the richness of their data to gain an advantage in diversity.

The threshold for big data

"If all you have is a pile of people's phone numbers, that may not mean much. Data like Ctrip's is valuable, though: bookings, searches, browsing, and reviews. But the deeper core is whether you can put that data to use in a product and whether it really helps," Jiao Yu, general manager of Data Analytics Intelligence at BI Hui, told TBO (Travel Business Watch).

The head of the Meituan Cloud big data platform agrees: "We must first figure out whether the data we own is valuable and whether anyone is willing to pay for it, and whether richer source data can supplement and improve the value of that data."

Clearly, the purpose of data collection is not simply to bring data together but, ultimately, to make a difference in actual operations. Owning the data is only the beginning. How to analyze it in depth, and how the data points relate to one another, is the key to applying big data. This is also what separates one big data company from another.

However, one problem in this process cannot be ignored: the quality of the data. "Wrong input inevitably yields wrong output," Han Xin, a director at Cellular Data Technology, pointed out in an interview with TBO (Travel Business Watch).

"What really decides whether data mining succeeds or fails is the quality of the data itself. Choosing and tuning algorithms sensibly is secondary. With the rise of big data, we can easily obtain large amounts of messy data. But expecting advanced algorithms alone to extract the information we want, while ignoring the quality of the data itself, is often building castles in the air."

For big data, the more data the better, because more data can produce scenarios that fit the real world more closely. At the same time, more data also produces more noise, so a larger volume of data alone does not increase the accuracy of the results.
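The point that volume alone does not buy accuracy can be illustrated with a small simulation. This is only a sketch under invented assumptions: the true value of 10, the 30% share of corrupted records, and the names `clean_sample` and `noisy_sample` are all hypothetical and do not come from any of the companies quoted here.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 10.0  # hypothetical quantity we are trying to estimate


def clean_sample(n):
    """Return n measurements that carry only small random error."""
    return [random.gauss(TRUE_VALUE, 1.0) for _ in range(n)]


def noisy_sample(n, bad_fraction=0.3):
    """Return n measurements in which a fraction are corrupted records."""
    rows = []
    for _ in range(n):
        if random.random() < bad_fraction:
            # Corrupted record: centered far away from the true value.
            rows.append(random.gauss(50.0, 1.0))
        else:
            rows.append(random.gauss(TRUE_VALUE, 1.0))
    return rows


small_clean = clean_sample(100)
large_noisy = noisy_sample(10_000)

# Estimate the true value with a plain average and compare the errors.
error_clean = abs(statistics.mean(small_clean) - TRUE_VALUE)
error_noisy = abs(statistics.mean(large_noisy) - TRUE_VALUE)

print(f"100 clean rows:    error = {error_clean:.2f}")
print(f"10,000 noisy rows: error = {error_noisy:.2f}")
```

In this setup the hundred clean rows land much closer to the true value than the ten thousand noisy ones, because the corrupted records pull the average in one direction no matter how many rows are added.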

So having high-quality data is far more valuable than having a pile of messy data: it not only reduces the difficulty of data mining but also improves its accuracy. But is this the core threshold for big data?

Han Xin said: "Building a complete big data system also requires two other important factors: richness of business scenarios and the adoption of data thinking."

Jiao Yu, drawing on his own practical experience, offered his view: "The threshold for big data is, first, a particularly good product manager who understands what this thing really is, and second, strong modeling ability. Take these two together: some companies have big data, and in theory finding an expert to do this work should be easy, but in practice it is very difficult."

He Xiaofei, dean of the Didi Research Institute, gave this answer: "The first is big data itself. The second: some people compare data to 'oil'. Where there is a reserve of oil, you need a tool to dig it out, and that tool is machine learning. The third is the improvement in computing power. The tools have to get stronger, because if they are not powerful enough, nothing moves."

The difficulty of data mining

Data mining is not like data collection, which can be handled by filling in a few forms and asking a few questions. It is far more specialized, draws on much more knowledge, and is significantly more difficult technically. As a result, most data mining is done by professionals or professional teams.

In addition, whether the modeling succeeds has a major impact on what the data ends up showing. Different models tend to produce different results.

"Anyone can pick up a model, run it, and get a result, but does that result reflect the real world? The relationships between data are not simple linear relationships, so the model can be very complex. First you have to know what problem you are trying to solve: what type of problem it is statistically, what its characteristics are, and what limits you face in collecting data. Then you look for the model that comes closest to the problem," Jiao Yu said.

"The challenge of data mining lies in the interdependent yet conflicting relationship between the initial data collection and the final application, which resembles the 'chicken or egg' question. The two interact with and complement each other, so compared with other kinds of software development it is a longer and more complicated process," said Han Xin.

Whether it is the model Jiao Yu describes or the algorithm Han Xin describes, both emphasize the same point: models and algorithms must be adjusted as actual conditions change. There are no fixed rules, only updated data and ever-changing situations, so the rules applied must be adjusted accordingly.
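Jiao Yu's remark that relationships between data are rarely linear, and that the model has to match the problem, can be shown with a toy case. The sketch below is an illustration only: the seven data points and the helper names `fit_line` and `sum_squared_error` are invented for this example and are not taken from any of the interviewees' systems.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_error(predict, xs, ys):
    """Total squared gap between a model's predictions and the data."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data with a curved (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

# Model 1: the best possible straight line through the data.
slope, intercept = fit_line(xs, ys)
linear_sse = sum_squared_error(lambda x: slope * x + intercept, xs, ys)

# Model 2: a model whose shape matches how the data was generated.
quadratic_sse = sum_squared_error(lambda x: x * x, xs, ys)

print(f"straight line: slope={slope:.2f}, intercept={intercept:.2f}, error={linear_sse:.1f}")
print(f"curved model:  error={quadratic_sse:.1f}")
```

Both models return an answer without complaint, but the straight line comes out flat and misses the pattern entirely, while the model with the right shape fits exactly. Running a model is easy; knowing which one suits the problem is the hard part.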

According to the head of the Meituan Cloud big data platform, obtaining "standardized data" is the real difficulty: "Meituan-Dianping generates petabyte-scale data every day, including large amounts of merchant, user, and interaction data. Only after daily batch and real-time cleaning with big data tools such as Hadoop, Hive, Spark, and Storm does it become standardized data."
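What "cleaning" raw records into standardized data involves can be sketched on a very small scale. The example below is not the pipeline described in the quote, which runs on distributed tools; it is a single-process illustration, and the field names (`user_id`, `amount`, `time`), the time format, and the functions `standardize` and `clean_batch` are all assumptions made up for this sketch.

```python
from datetime import datetime


def standardize(record):
    """Return a cleaned copy of one raw record, or None if it is unusable."""
    user_id = str(record.get("user_id", "")).strip()
    if not user_id:
        return None  # the event cannot be attributed to anyone
    try:
        amount = round(float(record.get("amount", "")), 2)
        when = datetime.strptime(record.get("time", "").strip(), "%Y-%m-%d %H:%M")
    except (TypeError, ValueError, AttributeError):
        return None  # unparseable amount or timestamp
    if amount < 0:
        return None
    return {"user_id": user_id, "amount": amount, "time": when.isoformat()}


def clean_batch(records):
    """Standardize a batch, dropping bad rows and exact duplicates."""
    seen = set()
    cleaned = []
    for raw in records:
        row = standardize(raw)
        if row is None:
            continue
        key = (row["user_id"], row["time"], row["amount"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


raw_rows = [
    {"user_id": " u1 ", "amount": "35.50", "time": "2017-03-01 09:15"},
    {"user_id": "u1", "amount": 35.5, "time": "2017-03-01 09:15 "},  # duplicate once cleaned
    {"user_id": "", "amount": "12", "time": "2017-03-01 10:00"},     # missing user
    {"user_id": "u2", "amount": "n/a", "time": "2017-03-01 10:05"},  # bad amount
    {"user_id": "u3", "amount": "8", "time": "01/03/2017"},          # unexpected time format
]

cleaned = clean_batch(raw_rows)
print(len(raw_rows), "raw rows ->", len(cleaned), "standardized rows")
```

Even in this tiny batch most rows are dropped or merged, which gives a sense of why producing standardized data at petabyte scale is described as the real difficulty.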

However, perhaps the hardest part is also the most practical one. Rapid technological development has provided many methods for processing information, such as statistical methods, case-based reasoning, decision trees, rule-based reasoning, fuzzy sets, neural networks, and genetic algorithms. These not only reduce the difficulty of data mining but also improve its efficiency and accuracy. But all of this requires a lot of money.

Many people may have heard of the headline cases of big data use: Facebook stores about 100 TB of user data every day; NASA processes about 24 TB of data daily. What does it cost to process this data?

Going by Amazon Redshift's figures, NASA would need to pay more than $1 million for 45 days of data storage service. According to one overseas survey, most enterprise CIOs say their budgets cannot cover the cost of big data deployment and that data storage and processing cost too much.

Is big data really accurate?

"For a given area, Didi's data brain can already make predictions 15 minutes ahead with more than 88% accuracy. Based on the forecast, it can decide whether to dispatch driver capacity so that nearby drivers reach areas where capacity is scarce ahead of time, easing possible congestion. In the travel field, predicting future traffic conditions helps with intelligent dispatching," He Xiaofei of the Didi Research Institute once said publicly.

This is a positive example. On the other hand, if big data cannot provide suitable solutions for an enterprise's marketing, decision-making, and operations, enterprises will not look favorably on its prospects. So whether big data is really "accurate" has been, from the very beginning, the point businesses care about most.
