Thursday, 30 May 2013

Data Mining and Its Importance in Achieving a Competitive Edge in Business


What is data mining? And why is it so important in business? These are simple yet complicated questions to answer; below is a brief overview to help you understand data and web mining services.

In general terms, data mining can be described as retrieving useful information or knowledge from data, analyzing it from various perspectives, and summarizing it into valuable insight that can be used to increase revenue, cut costs, or gather competitive intelligence on a business or product. Data extraction is therefore of great importance in the business world, as it helps a business harness the power of accurate information and so gain a competitive edge. Many business firms and companies maintain their own warehouses to help them collect, organize and mine information such as transactional data, purchase data and so on.

However, having mining services and a warehouse on-premises is not affordable, and it is rarely a cost-effective route to reliable information, even though extracting information is now a need for every business. Fortunately, many companies provide accurate and effective data and web data mining solutions at a reasonable price.

Outsourced data extraction services are offered at affordable rates and are available for a wide range of data mining solutions:

• Business data extraction
• Data set gathering services
• Mining information from data sets
• Website data mining
• Stock market information
• Statistical information
• Information classification
• Information regression
• Structured data analysis
• Online data mining to gather product details
• Gathering prices
• Gathering product specifications
• Gathering images

Outsourced web mining and data gathering solutions have proved effective at cutting costs and increasing productivity, all at affordable rates. Benefits of data mining services include:

• A clear understanding of customers, services and products
• Lower marketing costs
• Accurate information on sales and transactions
• Detection of beneficial patterns
• Reduced risk and increased ROI
• Detection of new markets
• A clear understanding of business problems and goals

Accurate data mining solutions can be an effective way to cut costs by concentrating effort in the right places.

We are an online web research company specializing in comprehensive web-based online research and data mining services. We have been providing high-quality, accurate online web research services, with expertise in the field, for the last 17 years. For more details, visit our website: http://www.onlinewebresearchservices.com

Source: http://ezinearticles.com/?Data-Mining-And-Importance-to-Achieve-Competitive-Edge-in-Business&id=5771888

Monday, 27 May 2013

Screen-scraping, a transition technology in an industry undergoing transformation

What is screen-scraping? Where does this strange name come from? Why is it used so much? And why can we consider it to be just a transition technology?

We no longer need to demonstrate that the internet has changed the travel industry: the players have grown in number, products have become more accessible, customers' habits have changed…

From the client's point of view, this is great. But for the internet players, it means an industry that grows more and more complicated because of the number of connections between them. This is how screen-scraping became so popular.
What is screen-scraping?

This technique appeared in 1999/2000, as the internet gained importance. It started booming in 2003, when content was growing and meta-search engines were becoming relevant.

In practical terms, screen-scraping consists of scanning the content of a website in order to reuse it on another website. To do that, screen-scrapers use web crawlers like those used by search engines (Googlebot, Yahoo! Slurp…). The difference is that these robots are specialized in e-commerce websites and are able to identify specific information: prices, dates, descriptions… They read the HTML code of the web page in order to reconstruct the original database of the website.
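
As a rough illustration (not any particular meta-searcher's code), here is a minimal Ruby sketch, using the Nokogiri HTML parser, of the kind of extraction such a robot performs on a single results page. The URL and CSS selectors are invented for the example; a real scraper has to be adapted to each site it reads.

```ruby
require 'nokogiri'
require 'open-uri'

# Placeholder URL; a real crawler would walk many result pages.
page = Nokogiri::HTML(URI.open('https://www.example-travel.com/results?from=CDG&to=JFK'))

# Rebuild structured records from the page's HTML. The selectors are
# assumptions for illustration; each target site needs its own.
offers = page.css('.offer').map do |offer|
  {
    description: offer.at_css('.description')&.text&.strip,
    date:        offer.at_css('.departure-date')&.text&.strip,
    price:       offer.at_css('.price')&.text&.strip
  }
end

offers.each { |o| puts o.inspect }
```
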
The screen-scraping business model.

Several business models are used. The majority of meta-search websites earn income through cost-per-click (CPC) and cost-per-action (CPA) models, then redirect the user to the retailer. The problem with this model is that the two parties need a prior agreement fixing the earnings, which compromises the meta-searcher's neutrality. Moreover, this commercial agreement is not visible to the user, who believes the meta-searcher's results have led him to the best deals.

In 2007, 11 of the 12 French meta-searchers were called to order by the DGCCRF.

So some of them have chosen a business model modeled on Google's. That means they generate "organic results", which are supposed to be as exhaustive as possible, while at the same time offering sponsored results consisting of text ads (e.g. Wego) or banners (e.g. Sprice).

Finally, we find the latest model on online travel agency websites. Because of their retailer business model, they cannot accept redirecting their visitors to a partner website. So in some cases they have direct connections with suppliers; in other cases, when they have no partnership or the technologies are incompatible, they develop an advanced form of screen-scraping: they build a copy of the supplier's booking process and integrate it into their own website.

Thus, the user cannot see that he is booking from an external website. In this case, the seller can add his own margin to the producer's price. This technique is used by well-known online travel agencies such as Lastminute, eDreams, Atrapalo, Govolo and Ebookers. Because it is not always used with the supplier's consent, this is one of its most controversial uses.
Troubles generated by screen-scraping.

So we can identify three kinds of trouble with screen-scraping: technical (web crawlers slow down website performance), qualitative (a wrong setup can produce a lot of mistakes) and ethical (using screen-scraping without prior permission can be the start of abuses).

As we can see, this last point is a recurrent cause of conflict in our industry:

* 2003: FareChase vs American Airlines
* 2004: FareChase vs Southwest Airlines
* 2008: Ryanair vs several travel agencies
* 2008: EasyJet vs Expedia

And if we compare with e-commerce in general, we notice that screen-scraping is no longer used there: meta-search websites and e-commerce websites prefer XML catalogues.

In the travel industry, because of the complexity of the offer (products are perishable and prices vary over time), the tendency is rather towards APIs. These allow a direct and reliable connection to inventory. They also make it possible to go beyond the standards imposed by the GDSs (cf. the Air Canada case study).
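
For comparison with scraping, here is a minimal Ruby sketch of querying a hypothetical supplier availability API and reading the JSON response. The endpoint, parameters and field names are invented for the example; real supplier APIs (and GDS alternatives) each define their own schemas and authentication.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Hypothetical availability endpoint; real supplier APIs differ in shape and auth.
uri = URI('https://api.example-airline.com/v1/availability')
uri.query = URI.encode_www_form(origin: 'CDG', destination: 'YUL', date: '2013-06-15')

response = Net::HTTP.get_response(uri)
offers = JSON.parse(response.body)

# Each offer carries a live price and seat count straight from the supplier's
# inventory, rather than figures reconstructed from scraped HTML.
offers.each do |offer|
  puts "#{offer['flight']}: #{offer['price']} (#{offer['seats_left']} seats left)"
end
```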

Source: http://www.kraukoblog.com/travel-technology/screen-scraping-a-transition-technology-in-an-industry-undergoing-radical-transformation/

Friday, 24 May 2013

My Edmonton’s Many Data Sources

As I mentioned in my last post, there are over 30 different data sources used in My Edmonton, and a lot of different methods, including screen-scraping, were required.  Again, I thought it would be interesting to developers out there to find out just how this was done, as screen-scraping (although legally grey) is a great way to produce your own API.

First, some background on screen-scraping.  Many different apps out there do this, as it's often the only way to get data for an application you need.  Some of my favorite apps that are based on screen-scraping are:

    Mint.com: You know, the startup that Intuit (my former employer) acquired last year for untold millions?
    Pageonce.com: Another financial website/iPhone app that I personally love for managing my finances.  If only CIBC would stop blocking them...
    AppSales: an iPhone app only available to iPhone developers on jailbroken phones, which screen-scrapes Apple's sales reports website for data and presents it in a nice, easy-to-use format.  As one Twitter poster wrote this morning (aptly timed): "Anyone who thinks iTunes sales reports work well should be forced to visit a website where they download their email one message at a time."  This little app makes iTunes sales reports actually consumable.  It has recently been supplanted by Apple's own ITC mobile app, but AppSales is still far better and was 2 years ahead of Apple's.  Why Apple doesn't provide an API for this service is beyond me.
    Virtually every "cheap flight" website like Expedia uses screen-scraping behind-the-scenes to get flight schedule and price information.

So, sometimes you need to do a little screen-scraping to get by.

As I mentioned, My Edmonton uses over 30 data sources.  These include:

    23 datasets from the city's Open Data Initiative.  These include amenity locations, neighborhood and ward information, garbage collection schedules, transit data, and event and construction calendars.
    7 sets were scraped off the city's own website, including the assessment data.
    Geocoding APIs such as Google, Yahoo, and geocoder.ca
    Canada Post (for postal code lookups)

All of these datasets combined provide the immensely useful set of features that My Edmonton offers.  For example, because My Edmonton knows the location of a certain home (thanks to the City's assessment data and Google Maps' Geocoding API), we can then look at various pieces of information:

    We can discover which Garbage collection zone the user is in without them having to manually look at it on a map.
    We can look at construction zones near their home.
    We can tell them which ward and neighborhood they are in, and provide details about how to call their city councilor.
    We can tell them their nearest bus stops and the bus route schedules (a rough sketch of this lookup appears just below).
    We can tell them the locations of various things like schools, playgrounds, etc. near them.

To my knowledge, this is a novel use of the city's data and very useful.
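
As a rough illustration of the "nearest bus stop" lookup, here is a minimal Ruby sketch that geocodes an address with the Google Geocoding API and then picks the closest stop by straight-line distance.  The sample address, the API key handling and the stop list are placeholders, not My Edmonton's actual code or data.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Geocode an address with the Google Geocoding API (an API key is assumed).
def geocode(address, key)
  uri = URI('https://maps.googleapis.com/maps/api/geocode/json')
  uri.query = URI.encode_www_form(address: address, key: key)
  result = JSON.parse(Net::HTTP.get(uri))['results'].first
  location = result['geometry']['location']
  [location['lat'], location['lng']]
end

# Straight-line (haversine) distance in kilometres between two lat/lng pairs.
def distance_km(a, b)
  rad = Math::PI / 180
  dlat = (b[0] - a[0]) * rad
  dlng = (b[1] - a[1]) * rad
  h = Math.sin(dlat / 2)**2 +
      Math.cos(a[0] * rad) * Math.cos(b[0] * rad) * Math.sin(dlng / 2)**2
  2 * 6371 * Math.asin(Math.sqrt(h))
end

# Placeholder stops; the real list comes from the city's transit dataset.
stops = [
  { name: 'Stop 1234', coords: [53.5461, -113.4938] },
  { name: 'Stop 5678', coords: [53.5232, -113.5263] }
]

home = geocode('10065 Jasper Ave, Edmonton, AB', ENV['GOOGLE_API_KEY'])
nearest = stops.min_by { |stop| distance_km(home, stop[:coords]) }
puts "Nearest stop: #{nearest[:name]}"
```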

Screen-scraping

This article was about screen-scraping, right?

We "only" had to screen-scrape about 8 different data sets: the city's website and Canada Post.  Three different methods were used for these:

    When possible, the city's website was scraped with Needlebase, a recent Google acquisition.  Needlebase is one of the coolest tools I've used in a while.  You can point it at a web page and give it "examples" of what to look for and how to parse the data.  After about 3 examples, it usually gets everything right.  I used this to scrape city councilor data, as well as amenities not provided in the City's Open Data Catalogue, like playgrounds, tennis courts, etc.  I would have loved to gather facility schedule information, like the Kinsmen pool's hours, but that information is all in disparate formats and not scraper-friendly; some of it is in PDFs, some on web pages.  Given enough time, I could gather this, but it wasn't worth it for this competition.
    Canada Post was scraped using a custom Ruby script built on a package called Mechanize (a rough sketch of this approach appears at the end of this post).  This site is NOT scrape-friendly: it has a kind of interview where you have to fill out multiple forms to get the data you want, and it looks like the site is Microsoft-based, with form names that are very long and convoluted.
    The city's assessment website was scraped with a custom Ruby script using Nokogiri to parse the HTML (also sketched at the end of this post).  Originally, my teammate Terry wrote it in C#, but it wasn't performant enough and didn't interface with a database, so I translated it to Ruby.  The script makes heavy use of caching to avoid any duplicate calls to the website.  As I mentioned, we've sent over 3.5 million requests to the city's servers in 2 months, giving us 200,000 assessments so far.

I prefer not to post code, as there is a lot of it, but if anybody is interested in seeing code, feel free to leave a comment.
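
That said, for readers who want a feel for the general shape of these scripts, here are two minimal, illustrative sketches.  The URLs, form field names and CSS selectors below are placeholders invented for the example; they are not the project's actual code, and the real sites (Canada Post and the city's assessment pages) are structured differently.

```ruby
require 'mechanize'

agent = Mechanize.new

# Hypothetical lookup form; the real Canada Post flow spans several pages.
page = agent.get('https://www.example.org/postal-code-lookup')
form = page.forms.first

# Field names are placeholders; the real form names are long and convoluted.
form['streetName'] = 'Jasper Ave'
form['city']       = 'Edmonton'
form['province']   = 'AB'

results = agent.submit(form)

# Mechanize pages delegate to Nokogiri, so CSS selectors work directly.
results.search('.postal-code').each { |node| puts node.text.strip }
```

The assessment scraper deals with plain pages rather than forms, so a sketch of that style only needs Nokogiri plus a simple on-disk cache to avoid hitting the same URL twice:

```ruby
require 'nokogiri'
require 'open-uri'
require 'digest'
require 'fileutils'

CACHE_DIR = 'cache'
FileUtils.mkdir_p(CACHE_DIR)

# Fetch a URL, reusing a cached copy on disk if we've seen it before.
def fetch_cached(url)
  path = File.join(CACHE_DIR, Digest::SHA1.hexdigest(url))
  return File.read(path) if File.exist?(path)
  html = URI.open(url).read
  File.write(path, html)
  html
end

# Hypothetical assessment page keyed by account number; the selectors are made up.
url = 'https://www.example.org/assessment?account=1234567'
doc = Nokogiri::HTML(fetch_cached(url))

address = doc.at_css('.property-address')
value   = doc.at_css('.assessed-value')
puts "#{address && address.text.strip}: #{value && value.text.strip}"
```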

Source: http://blog.myyeg.com/2010/09/my-edmontons-many-data-sources/

Thursday, 16 May 2013

Expedia Data Scraping

Within the online community there are plenty of companies that offer services. But some of these companies are not who they say they are, and they do not produce the results they boast about. This is devastating to clients and customers who purchase products and services only to feel let down in the end. One thing happening online right now is companies claiming to offer high-quality data scraping services that they cannot actually deliver.

The kinds of companies that engage in this behavior are hurting the market. They claim to offer high-quality web scraping for a low price, but these statements are simply not true. In fact, they are offering low-quality data scraping services at a very high price, taking advantage of people who are not knowledgeable about the business. These types of business practices are not favorable and are very unethical.

That is why our company could not have come at a better time. People want to do business with a company that has integrity. When you can add high quality at a low price to that equation as well, people begin to notice. With all the bad companies around, people want something that is competent, cost-effective, and high quality. That is a win for all the parties involved.

We are aware of how useful data scraping services are and of what it takes to stand out in this industry. We have employees with experience in the field, and we can take care of any kind of web scraping service desired. If you are trying to reach people looking for children's toys online, we will put you in a position to market your products to them.

Our quality and level of service are second to none in the field. We work hard to separate ourselves from the pack, and our work shows it. If you are interested in web scraping services, we are the perfect place for you. We are courteous and professional at all times. We want you to win, because it is our job, and we also have a reputation to uphold. So don't hesitate; we are waiting for you.

Source: http://thewebscraping.com/expedia-data-scraping/

Monday, 6 May 2013

Expedia poised to 'reinvent online travel', claims new UK chief

Expedia claims to have invented online travel 15 years ago and that it is now on the verge of reinventing it for the Web 3.0 age.

Speaking to Travolution this week, new UK managing director Andy Washington, who joined from Thomas Cook in June, remained silent on details of what the online travel giant has planned.

But he revealed the first signs of what Expedia is cooking up will start to emerge within weeks.

In his first trade interview since joining, Washington, who has 19 years' experience in the travel industry having worked for lastminute.com and Cosmos, said a new era beckons for Expedia.

The online travel agent has already hinted at what is to come in its latest brand television advertising which showcases its people and its expertise rather than its offers.

And this move away from commoditising hotel rooms or aircraft seats, from global corporate behemoth to a more local, customer-centric entity, will characterise its transformation.

Washington said the strategy in place at Expedia is “the most exciting” he has seen in his nearly two decades in travel.

“It’s all about the trip planning process, social media, the booking process and customer experience whilst in resort or the hotel and when they come back home as well," he said.

“Expedia has been a very corporate business that commoditised travel products and sold them online. They did not focus on looking after consumers, but we will do a much better job of that going forward.

“This is about Web 3.0 and how customer interaction evolves. It’s about interacting with social media, on mobile, online - everything is part of your booking experience.”

Washington could not reveal specifics, but said the next two years will see Expedia's site undergo a transformation with the key aims of improving the planning stage of booking trips and increasing loyalty.

As well as enhancements to the functionality of the core site, consumers can expect new mobile apps, social media integration and an expanded product range.

“There are consumers who will always want a traditional search because they know what they want and they want to book it quick.

“But there is a customer base that wants to be inspired and does not know what they want. They want user generated recommendations but they also want recommendations from a travel company in a natural and impartial way.

“It’s not just about having every flight, every hotel in the market readily available, it’s about having the best flight, the best hotel and the best fit for that family and that can be done through web 3.0, through social media, rather than just having a full listing and being driven by price.”

Washington considers many holiday sites that boast about how many millions of deals they have on offer to be more focused on industry brinkmanship than on offering customers what they want.

“I have seen some dynamic packaging sites recently where they scrape Ryanair and you have to get through 30 pages before you get to any packages without a Ryanair flight,” he said.

Essentially, Expedia is looking to drive up the loyalty that eludes many online players who end up spending millions chasing the customers they do not already have.

“The best customer to have is the one that is loyal and is going to come back time and time again. It’s about making sure our product and our sites and everything we do is tailored to the customers rather than a one size fits all,” said Washington.

The vision for the new approach is being driven from the top by Scott Durschlag, president of Expedia Worldwide and former Skype chief operating officer.

Below him, Gary Morrison, the former Motorola head of product management and Google global sales chief, has been brought in to oversee operations in the EMEA region.

One rung down, Washington, who joined after a short stint at the Thomas Cook online travel agency based in London, was recruited to run the UK office.

“We are bringing more travel people in and people with experience in e-commerce. We need additional resources for customer relationship management and customer experience.

“This is about saying we are experts in travel rather than just ‘here’s a website’. What the consumer wants the consumer will find on our site.”

Source: http://www.travolution.co.uk/articles/2011/09/13/5013/