
Digital Library of India: Download all that you can...

The Digital Library of India (http://dli.iiit.ac.in/) has been unveiled, but with a shocker of an interface. Not much can be expected of a "Government of India" project, though, as they always manage to find just the right technologies (or people?) for the job. (Why, e-governance in India is all set to go Microsoft's way. When M$ boasts of its riches, we can all show our kids its logo and say that our government of poor people is one of its key customers.) They use the TIFF format for image scans of thousands of books, probably from libraries all over India. There are two petty interfaces, both of which need you to download a piece of software to view the pages. The software, in turn, needs to be registered before you can use it. Last month it was the disappointing set of tools (http://ildc.gov.in/Kannada/kdownload.htm) released by TDIL, and this month, the DLI. The projects are obviously worth a lot, but shabbily done. The idea seems right, but the implementation has been terribly bad.

A Quick Script

To resolve the conflict between my interest in the books on DLI (which are otherwise not easily available) and the irritation of the shabby interface, I wrote a (shabby) script that batch-downloads the TIFFs. It is a quickly written script, with pieces from here and there, that has many stupid parts (which undoubtedly would be mine). But mainly it works, just like the projects I've been mentioning here. Serves right, in a way. It needs the URL of the first page's TIFF for each book, with the filename removed. Pretty clumsy, yes. But that was convenient for me, since I removed the frames from their web page while viewing, browsed through the list, and clicked on each book to check the quality. Saves the irritation for the next many pages. Ah, and you'll need to paste the URLs into a file. For use on just a single book, it is easy to modify, anyway. Try it, modify it and let me know if you improve it.

#!/bin/sh
# Get your favourite book from DLI: specify the start page and end page,
# and this script takes care of the rest.
# Caveat: you'll need to specify the base URL, though.
PATH=/bin:/usr/bin:/usr/local/bin
progname=`basename "$0"`
if [ $# -ne 2 ]
then
    echo "$progname: usage: $progname start end" 1>&2
    exit 1
fi
start=$1
end=$2
echo "Enter path for the file to read:"
read file
#exec > $HOME/log_dli.txt
lns=`wc -l < "$file"`
echo "LNS: $lns"
x=1
# One base URL per line; each book goes into its own numbered directory.
while [ $x -le $lns ]
do
    url=`head -n $x "$file" | tail -n 1`
    index=`expr "$start" + 0`
    mkdir -p $x
    cd $x || exit 1
    while [ $index -le $end ]
    do
        # Page files are named with eight zero-padded digits, e.g. 00000001.tif
        page=`printf '%08d' $index`
        WGET_OUTPUT=`wget --timestamping --progress=dot:mega "$url$page.tif" 2>&1`
        if [ $? -ne 0 ]
        then
            # wget had problems.
            echo "$progname: $WGET_OUTPUT" 1>&2
            if echo "$WGET_OUTPUT" | grep -F 'Not Found' > /dev/null
            then
                # Ran past the last page of this book.
                break
            fi
        else
            echo "~~~~ Page found. Downloaded. ~~~~"
        fi
        index=`expr $index + 1`
    done
    cd ..
    x=`expr $x + 1`
done


Note: Make sure each URL is of the form http://dli.iiit.ac.in//server12/disk3a/TO%20SUBMISSION/KANNADA/Bharatiya%20Tatva%20Shastra%20Samgraha//PTIFF followed by a trailing slash, since the script appends the page filename directly to it. Place each URL on its own line in the file (the script doesn't detect empty lines).
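To make the URL requirement concrete, here is a minimal sketch of how a page address is put together: the base URL, then an eight-digit zero-padded page number, then .tif. The function name page_url and the example.org address are made up for illustration; they are not part of the script above.

```shell
#!/bin/sh
# page_url BASE PAGE: print the address of one page image.
# BASE must end in "/" because the file name is appended directly to it.
page_url() {
    printf '%s%08d.tif\n' "$1" "$2"
}

# Example with a made-up base URL (not a real DLI address):
page_url 'http://example.org/some-book/PTIFF/' 7
```

If the trailing slash is missing, the page number gets glued onto "PTIFF" and the server will most likely answer "Not Found" for every page.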

Enjaaay!

Comments

Namaskara sir. I recently came to know about the Digital Library of India. I copied your shell script and ran it, but I didn't understand how to give the URL of the file. Is it necessary to download the .tif file? Can I give the URL link of the tif file? I downloaded a file and gave its path to the shell script, and when I ran the script it created nearly 500 empty folders. Did I do it the wrong way? I want to download some books of Shivarama Karantha. Please help me. Regards, Sathish Nayak

You'll need to put the URLs in a text file and specify the path to that file when the script asks for it. The URLs should be of the form http://.../PTIFF/ (paste it from the Image Link for each book). Besides, "cut & paste" use of this script is not recommended. The script doesn't do many fancy things beyond what it is intended to do when the input is right. Feel free to improve it.

Namaskara Nadigare, I don't know how to run scripts. Could you please tell me, step by step, how to run this and download books? (Perhaps I could have said all of this in English, since 80% of what I wrote is in it anyway. Never mind :)) I am waiting for your reply. Vinayaka

Dear Sir, Please help me also. Please teach me how to download books from the Digital Library of India. I want to download some Tamil books. Please help me. Mala

Dear Mala, copy the script written by the author of this web page and then create a text document by following the steps below.
1) Right-click on the desktop, go to New and select Text Document, then paste the script.
2) Go to File, select Save As, and save the file as .vbs. For example, if the file name is NEW.TXT, save it as NEW.VBS.
3) You have now created a script which runs when you double-click it.
4) I spent almost three days on how to create a script file, so I don't want any more people to struggle.
5) I tried running the above script but I am getting an error, which only the author can rectify.

Dear Sharath, This is a shell script. http://en.wikipedia.org/wiki/Shell_script will give you more information. Cheers,

The script has some problem; it shows "Error 1" or something. Can you please provide an accurate fix for the script, so that books can be downloaded easily? Waiting for the reply. Thanks in anticipation.

Himanshu, I checked the script and it is still working for me as intended. As much as I'd wish to, I don't have the time to improve it presently, but you're free to modify and improve the script. This is something I wrote quickly for my own use.

Dear people, happy news for you folks. I got a message from one of my friends on how to download files from the Digital Library of India without breaking your head over how to create a script or something of that sort, which a layman won't understand. Install the AlternaTIFF plug-in (I know most people are not interested).
1) Have a download manager (FlashGet) which can download files in batch mode.
2) Open a page from the book you want to download and right-click on it; you get a menu.
3) Select Copy Image Location. It will look like this: http://www.new.dli.ernet.in/data_copy/upload/0069/462/PTIFF/00000001.tif
4) Now comes the real part. In FlashGet, go to File, then Add Batch Download. You get a box asking for a URL; paste the link there, modified as shown in the next step. Check how many pages the book has; the count is shown at the bottom of the page.
5) Replace the last 3 digits with (*). The book in the link above has 251 pages, so the wildcard length is 3, since 251 has 3 digits. It will look something like this: http://www.new.dli.ernet.in/data_copy/upload/0069/462/PTIFF/00000(*).tif
6) Create a new folder where the files should be stored, then press OK. Enter the page numbers from start to end, i.e. from 1 to 251, and press OK.
7) Now go to the kitchen, make a cup of coffee and drink it. By that time it will have started downloading.
8) You may find the above steps too clumsy, but they aren't. Give it a try, and if you get held up somewhere, let me know.
9) I haven't tried it myself, but my friend has, and he has sent me a book downloaded from there, which I had asked him for.
10) I don't know how he converted the files to PDF. I've asked for his help regarding it and shall let you folks know soon.
If you have any problems with installations or anything of that sort (don't worry even if you feel they are silly), I shall try to help you to the best of my knowledge.
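For readers on Linux, the batch pattern described in this comment (one URL per page, numbered from 1 to 251) can be approximated in shell. This is only a sketch: the function name gen_urls and the file and folder names pages.txt and book are made up for illustration, and it assumes the page files are numbered with eight zero-padded digits as in the link quoted in the comment.

```shell
#!/bin/sh
# gen_urls BASE FIRST LAST: print one page URL per line.
# BASE must end in "/"; pages are numbered with eight zero-padded digits.
gen_urls() {
    base=$1
    page=$2
    last=$3
    while [ "$page" -le "$last" ]
    do
        printf '%s%08d.tif\n' "$base" "$page"
        page=$((page + 1))
    done
}

# Usage (left commented out so that nothing is fetched by accident):
# gen_urls 'http://www.new.dli.ernet.in/data_copy/upload/0069/462/PTIFF/' 1 251 > pages.txt
# wget --input-file=pages.txt --directory-prefix=book
```

Writing the list to a file first makes it easy to check the addresses before handing them to wget.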

The same procedure works in Internet Download Manager (popularly called IDM) for fast downloads. After all the TIF files are downloaded, convert them into one PDF as follows. Download CutePDF Writer, which is free software, and install it (2 files: one is the exe, the other is the gpl updater). After installing, open the folder of one book, select all the TIF files and choose Print. Now select CutePDF Writer, that's all. It will go through all the images and ask where to save. Give the file a name and the PDF file will be created.
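On Linux, the same TIFF-to-PDF step can probably be done from the command line. The sketch below assumes ImageMagick is installed (it provides the magick command in version 7 and convert in older versions); the function name tiffs_to_pdf is made up for illustration, and this is not what the commenters above used.

```shell
#!/bin/sh
# tiffs_to_pdf DIR OUT.pdf: combine DIR/*.tif into one PDF, in file-name order.
# Zero-padded names such as 00000001.tif sort into the right page order.
tiffs_to_pdf() {
    dir=$1
    out=$2
    set -- "$dir"/*.tif
    if [ ! -e "$1" ]
    then
        echo "no .tif files in $dir" 1>&2
        return 1
    fi
    # ImageMagick 7 installs "magick"; older releases install "convert".
    if command -v magick > /dev/null 2>&1
    then
        magick "$@" "$out"
    else
        convert "$@" "$out"
    fi
}

# Usage (assuming the pages were saved in a folder named "book"):
# tiffs_to_pdf book book.pdf
```

For a book of a few hundred pages this may take a while and use a fair amount of memory, since all the images are handled in one run.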

Dear hpn, thank you for the link. I didn't know it was a shell script. Sorry for my (half-baked) knowledge in guiding Mala. This script runs on Linux or another Unix-based OS, so don't break your head over it; just follow the simple steps I have given above and you shall get what you want. Those steps are for Windows, so plenty of users can take advantage of them. (Don't tell the people at DLI, because those idiots (I am using this word because the shabby work they have done has convinced me) may redesign the system so that books cannot be downloaded even with FlashGet, which is free software available on the internet.) Enjoy the books, and after downloading don't forget to thank my friend for providing a solution for Windows users.

The script at http://sampada.net/blog/shreekantmishrikoti/18/12/2008/14836 is similar to this one but much shorter. Thanks to Perl!

Hi there, I have been puzzling over this script and it didn't work for me. I don't know how to get it working. Maybe I will have to find a solution from one of my friends who has mastered this subject. I am using Ubuntu 9.10 as my platform. Anyway, thanks for your work. Well, have a great time googling.

Hello Hari, This is a very nice script. But nothing is mentioned about the file format in which to save the script or how to run it on Windows-based machines. Can you please let me know how to save and run this script? In which format should it be saved? Please email the details to my email address mentioned above. Regards, Bala Krishna

Go to this link to download an application that will allow you to download books from the digital library http://www.sanskritdocuments.org/scannedbooks/dlidownloader/

Use CutePDF Writer to turn any document into a PDF: choose Print and select CutePDF Writer, that's all.
