Do you know what a CAPTCHA is?
It is there to confirm that you, the person filling in a form or booking a ticket, are a human and not some sort of computer program. Without it, a programmer could write code that books a ticket millions and millions of times. Basically, it is used for authentication: proving that you are human.
This works because humans have little trouble reading the distorted characters, while a computer program cannot reliably do so.
The waviness and the horizontal stroke are added to make the CAPTCHA harder for a computer program to break.
BUT WAIT,
Roughly 200 million CAPTCHAs are typed by humans every day.
Each time you type a CAPTCHA, you spend about 10 seconds of your time.
Humanity as a whole is wasting roughly 500 thousand hours every day.
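To see where the "500 thousand hours" figure comes from, here is a quick back-of-the-envelope check. It only uses the two approximate numbers quoted above; the variable names are mine.

```python
# Approximate figures quoted in the text.
captchas_per_day = 200_000_000
seconds_per_captcha = 10

# Total human time spent on CAPTCHAs in one day.
total_seconds = captchas_per_day * seconds_per_captcha
total_hours = total_seconds / 3600  # 3600 seconds in an hour

print(f"{total_hours:,.0f} hours per day")  # 555,556 hours per day
```

So the rounded "500 thousand hours" is, if anything, slightly on the low side.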
So can this human effort be used for something beneficial? After all, in those 10 seconds you are doing something that a computer cannot do.
The answer is YES.
A few years later, a project called reCAPTCHA started. You have probably seen it.
What you may not know is that while typing a reCAPTCHA you are not only authenticating yourself as a human, you are also helping to digitize books.
Digitizing a book involves several steps, starting with scanning it page by page. Scanning gives you an image of every page of the book. To work out each word in the image, OCR (Optical Character Recognition) is used: it takes a picture of text and tries to figure out the text in it. The problem is that OCR is not perfect, especially for older books where the ink has faded and the pages have turned yellow.
So the words that OCR could not recognize are sent to you as a reCAPTCHA. The next time you type a reCAPTCHA, the words you are typing actually come from books that are being digitized and that a computer could not recognize.
The reason reCAPTCHA shows two words is that one of them was just taken out of a book: the system does not know what it says, so it cannot grade your answer for it. It therefore also shows you a second word for which it does know the answer. You are not told which word is which, and you have to type both. If you type the known word correctly, the system assumes you are human, and it also gains some confidence that you typed the other word correctly. If the same unknown word is shown to 10 different people and all of them agree on what it says, one new word has been digitized correctly.
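The two-word check described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not reCAPTCHA's actual code: the names `UnknownWord`, `check_challenge` and `AGREEMENT_THRESHOLD` are hypothetical, answers are compared ignoring case and surrounding spaces, and a word counts as digitized once its most common answer has been given 10 times, which is a slightly looser rule than "all 10 people agree".

```python
from collections import Counter

# The "10 different people" from the text; the constant name is hypothetical.
AGREEMENT_THRESHOLD = 10


def normalize(word):
    """Compare answers ignoring case and surrounding whitespace (an assumption)."""
    return word.strip().lower()


class UnknownWord:
    """Collects human answers for a word that OCR could not read."""

    def __init__(self):
        self.votes = Counter()

    def record(self, answer):
        self.votes[normalize(answer)] += 1

    def digitized(self):
        """Return the agreed word once enough people gave the same answer, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= AGREEMENT_THRESHOLD else None


def check_challenge(known_answer, typed_known, typed_unknown, unknown):
    """Grade only the known word; if it is right, count the other answer as a vote."""
    if normalize(typed_known) != normalize(known_answer):
        return False  # known word is wrong: not treated as human, vote is discarded
    unknown.record(typed_unknown)
    return True


# Ten people see the same pair: the known word "morning" and an unread scan.
word = UnknownWord()
for _ in range(10):
    check_challenge("morning", "Morning", "overlooks", word)
print(word.digitized())  # overlooks
```

Note that the unknown word is never graded. It only collects votes from people who got the known word right, which is why a wrong answer on the known word throws the whole attempt away.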
With this, a lot of websites switched from CAPTCHA to reCAPTCHA, for example Facebook and Twitter.
So through reCAPTCHA, roughly 100 million words are digitized per day, which is equivalent to about 2.5 million books a year.
