Multiple HTMLs into a Single Document – MultiDoc2HTML v0.1

I have been developing a program that takes a bunch of HTML files and puts them into a single Word document. The files need to go in as rendered in IE, with no HTML tags visible, yet all the links, etc. should carry over into the generated document. The manual way to do it would be to open the HTML files one by one in Internet Explorer or any other browser, press CTRL+A, then CTRL+C, open the Word document, go to End of File (EOF) and then press CTRL+V. One obvious question is the use case: why would I need to do something like this?

We are all aware of e-books in CHM (Compiled HTML) format. I have loads of them. Sometimes the TOC (Table of Contents) gets corrupted for one reason or another and I cannot navigate around. The other thing is that if I want to print, I cannot print all of the topics in the e-book at once. I need to select them one by one, and I don’t know if it is only on my PC, but I cannot get them to print properly: most of the time, the output is blank pages. You also lose the flexibility of converting to another convenient format such as PDF. Hence I generally decompile the CHM using Microsoft HTML Help Workshop and then make a doc out of the HTML files. If you have 5 HTML pages to deal with, it is okay to do so manually. And I found it very boring even with 5 anyways.

Read more about it and get the binaries and source code at the forum link: http://www.naresh.se/phpBB/viewtopic.php?f=17&t=11. The same thread will also cover the implementation details (pipes, events, COM interfaces, wrappers, etc.) once I dissect the code and start posting explanations for interested people to follow. The program also uses the Full Duplex anonymous pipes explained in the post: http://www.naresh.se/2009/09/16/anonymous-pipes-in-windows/

Ahhh… and one more thing before I close the post. You might be wondering why I am writing code for doing this shit when I can open a Word document, click on Insert File, select all the HTML files and get the same result as my code. Well, that does work, but try doing it with 100 files and Word simply cannot cope. At least it didn’t work on my machine, and I don’t know if my machine’s Word installation is fucked up or if Word seriously cannot do it.

Cheers & have fun computing…

Anonymous Pipes in Windows!

Recently I was working on a program to get a set of HTML files rendered in IE and then copy the contents from the IE window to a Word document. I wanted to do this in VC++ since it provides very easy access to COM objects like IE, Word, etc., and I have also used it in the past to build BHOs (Browser Helper Objects) and the like. VB also supports COM but usually needs pretty complex procedures to do minor things. And I forgot about .NET and its wrapper libraries around COM objects, which provide very easy access; if I had to do something like this again, I would rather use C#. But as it goes, I started coding in VC++. The basic logic of the code is as follows:

– Ask the user to select a directory with all the HTML files that need to be put into Word
– Ask the user for the Word document into which the content will be put
– Open the specified Word document and keep a handle to the application or document object
– Open an IE window (invisible)
– Render the HTML file in the invisible IE window
– Select all the contents of the IE window and copy them to the Windows Clipboard
– Paste the clipboard contents into the Word document and save the file
– Repeat the steps from rendering the HTML file onwards until the contents of all the files are in the Word doc
– Save and close the Word file and the application object, and release all the COM instances
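
The loop in the steps above can be sketched roughly as follows. This is only a control-flow sketch in portable C++, not the actual MultiDoc2HTML source: the IE rendering, clipboard and Word calls are hidden behind two hypothetical callbacks (appendRendered and saveDocument), and the names selectHtmlFiles and mergeInto are made up for illustration. The file names are assumed to come from whatever directory listing the application already has.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Lower-case copy of a string, used for case-insensitive extension checks.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// True when `name` ends with `suffix`.
static bool endsWith(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Keep only *.htm / *.html names and sort them so the output order is stable.
std::vector<std::string> selectHtmlFiles(const std::vector<std::string>& names) {
    std::vector<std::string> html;
    for (const std::string& n : names) {
        const std::string lower = toLower(n);
        if (endsWith(lower, ".html") || endsWith(lower, ".htm")) html.push_back(n);
    }
    std::sort(html.begin(), html.end());
    return html;
}

// Hypothetical driver: for each HTML file, append its rendered content to the
// document and save, mirroring the render/copy/paste/save steps in the list.
// Returns the number of files appended.
std::size_t mergeInto(const std::vector<std::string>& names,
                      const std::function<void(const std::string&)>& appendRendered,
                      const std::function<void()>& saveDocument) {
    const std::vector<std::string> files = selectHtmlFiles(names);
    for (const std::string& f : files) {
        appendRendered(f);  // stands in for: render in IE, select all, copy, paste
        saveDocument();     // stands in for: save the Word document
    }
    return files.size();
}
```

Keeping the COM work behind callbacks is a design choice made for the sketch: it lets the loop be exercised without IE or Word installed, and the callbacks could later wrap either the clipboard route or a direct insert.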

Sounds a bit complicated, and it is: I later realized that you can use the Word object’s InsertHTMLFile() directly instead of going through the Windows Clipboard, which is unreliable and also open to other rogue applications that can hang this particular application. But since I want to use it for my own purposes and not productize it, I am okay with the said approach. Besides, I had almost finished the whole application by the time I found that out. Wish Microsoft/Bing/Google could surface such information without me explicitly asking for it.

Anyways, after writing almost 60% of the application, I realized that my Office 2003 installation is fucked up and that the COM wrapper classes generated in VC++ after adding an ActiveX component for Word give a lot of compilation errors. I tried a lot of different things, including using #import on the GUIDs for VBE and Word with the no_implementation attribute. None of them worked: getting rid of the compilation errors only turned them into runtime errors when I called the wrappers from the code.

Then I decided to use a C# helper program to achieve the feat, and to use anonymous pipes to communicate between my main VC++ app and the client C# app. I didn’t want my pipes to be usable by any other program, and I wanted to limit the lifetime of the handles in case either of the apps crashes. Hence I used anonymous pipes instead of named pipes. And since an anonymous pipe is a one-way (half-duplex/simplex) pipe, I have created easy classes to implement either Full Duplex or half-duplex anonymous pipes depending on the need. In most cases, one would want Full Duplex pipes for proper two-way communication.
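
To illustrate the idea of building a Full Duplex channel out of two one-way anonymous pipes, here is a minimal sketch. It is not the code from the download: it uses the POSIX pipe() call as a portable stand-in for Win32 CreatePipe(), keeps both endpoints in one process purely to show the wiring, and the names Endpoint, DuplexPipe, sendLine and recvLine are made up for this example. The newline-terminated message format is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unistd.h>  // POSIX pipe(), read(), write(), close()

// One side of the channel: it reads from one pipe and writes to the other.
struct Endpoint {
    int readFd;
    int writeFd;
};

// Full-duplex channel built from two one-way anonymous pipes.
class DuplexPipe {
public:
    DuplexPipe() {
        int aToB[2];
        int bToA[2];
        if (pipe(aToB) != 0) throw std::runtime_error("pipe() failed");
        if (pipe(bToA) != 0) {
            close(aToB[0]);
            close(aToB[1]);
            throw std::runtime_error("pipe() failed");
        }
        // Index 0 is the read end, index 1 the write end of each pipe.
        a_.readFd = bToA[0];  a_.writeFd = aToB[1];
        b_.readFd = aToB[0];  b_.writeFd = bToA[1];
    }
    ~DuplexPipe() {
        close(a_.readFd);  close(a_.writeFd);
        close(b_.readFd);  close(b_.writeFd);
    }
    DuplexPipe(const DuplexPipe&) = delete;
    DuplexPipe& operator=(const DuplexPipe&) = delete;

    const Endpoint& a() const { return a_; }
    const Endpoint& b() const { return b_; }

private:
    Endpoint a_;
    Endpoint b_;
};

// Write one newline-terminated message, looping over partial writes.
void sendLine(const Endpoint& ep, const std::string& msg) {
    const std::string line = msg + "\n";
    std::size_t done = 0;
    while (done < line.size()) {
        const ssize_t n = write(ep.writeFd, line.data() + done, line.size() - done);
        if (n <= 0) throw std::runtime_error("write() failed");
        done += static_cast<std::size_t>(n);
    }
}

// Read bytes up to the next newline; stops early if the writer closed its end.
std::string recvLine(const Endpoint& ep) {
    std::string line;
    char c = 0;
    while (read(ep.readFd, &c, 1) == 1 && c != '\n') line.push_back(c);
    return line;
}
```

In a two-process setup, side a would stay in the main app and the two descriptors of side b would be handed to the helper when it is started. Because the pipes have no name, no unrelated program can open them, which is the property I was after.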

Download the code from http://www.naresh.se/phpBB/viewtopic.php?f=17&t=10. Please also post your follow-up discussions and comments in that same forum thread.

GetKeyboardState() & MouseHooks – Not available in Windows Mobile

I was writing a small piece of code for Windows Mobile in C# and wanted to use GetKeyboardState(), but after a lot of searching around I found out that the API is not available. Mouse hooks do not work in Windows Mobile either. As simple as this information seems, it took me a lot of time to find and validate it. The various ideas and blog posts out there are pretty confusing and do not let you reach a conclusion easily.

You can find the associated C# code at http://www.naresh.se/phpBB/viewtopic.php?f=17&t=9. The code has not been cleaned up yet, but expect a cleaner, documented version later on.

In brief:

– Extract the files in your project directory.

– In the class where you need to install hooks and capture key events, declare a private GlobalHooks field as shown below:

private GlobalHooks globalHooks = new GlobalHooks(Process.GetCurrentProcess().MainWindowHandle);

– After initializing your other components either in the class constructor or other methods, register for the events that you would like to handle as shown below:

            globalHooks.InstallHooks();

            // Register EventHandlers
            globalHooks.KeyDown += new KeyEventHandler(globalHooks_KeyDown);
            globalHooks.KeyPress += new KeyPressEventHandler(globalHooks_KeyPress);

– Then write the associated event handlers. In the following example, I handle KeyDown events and suppress the hard key (LSK) events, which the code checks for as Keys.LWin and Keys.Apps.

        // Logs every KeyDown event and swallows the keys that should not be passed on.
        void globalHooks_KeyDown(object sender, KeyEventArgs e)
        {
            Debug.Write(String.Format("KeyDown: Control: {0}, Alt: {1}, Shift: {2}, KeyCode: {3}\n", e.Control, e.Alt, e.Shift, e.KeyCode));
            // Mark these keys as handled so that they are suppressed.
            if (e.KeyCode == Keys.LWin || e.KeyCode == Keys.Apps)
                e.Handled = true;
        }

– Needless to say, you must call globalHooks.RemoveHooks() once you are done with your application, or else you will need a reboot!

So there you go. Enjoy and give me feedback on the forum.

District 9 – A Review…

Last night I watched the movie District 9 (D9) and I was really happy to have invested my time in it. It deals with quite a lot of issues about humankind, its so-called humanity and the plight of the so-called aliens. In one form or another it shows the fight that exists between one so-called well-to-do community and another. When the story is applied to another context, we see the same behavior repeated across various factions of the world and society in general.

Let me give you a brief outline of the story. An alien spacecraft appears over Johannesburg in South Africa one day. People are awed and expect things with light, sound, etc. to happen. But nothing happens, so humans force their way into the ship and find a huge number (a million something) of starving, disoriented aliens. They are given shelter in an area known as District 9. The story goes on from here to tell how humans were afraid of them, how some lawless aliens made it harder to establish proper relations with humans, etc. After more than 20 years, a huge relocation is planned to move the aliens from District 9 to District 10, and everything else in the story happens during this phase. To cut it short, humans who don’t know those aliens (prawns) are intimidated or terrified, or consider them plain stupid and treat them like animals.

Everybody forgets that they are a higher form of intelligence who could travel through space in such large numbers and who have advanced weaponry. Some of the other humans, of course, were preying on these poor creatures by selling them cat food and getting their weapons in return. But doesn’t all of this just show how fickle humanity is? Time and again we have seen it with the Greeks, Romans, Muslims, Hitler, etc. All of them at one point or another treated the other kind the way the humans treat the aliens in the movie. Humanity is just a good excuse for a chosen few to enslave and abuse others. I really felt pretty attached to the aliens/prawns after watching the movie, and looking at the very real face of humanity and humankind was terrifying. It makes you feel that humanity is a farce and human beings are just another species of animal without any intelligence whatsoever.

It is one thing to do so against another species, and it is horrific to do so against your own kind. Think about the Holocaust, the Crusades, the Inquisitions, etc.; it really is disappointing to see such horrors inflicted on humankind by a select few individuals to satisfy their lust for power/sex. Spewing venom against people who are of a different kind, different color, different race, different religion, different culture, etc. is just not human. But I see it happening even now. People in the US, for example, are afraid in these economic times that their jobs will be taken away from them. You can find a whole lot of URLs which point to their fear. They want the H1Bs canceled and only Americans employed in America.

Well well, just musing on humanity anyways… But all in all, it is a very good movie and I am looking forward to its second part, which I guess will be produced since one of the prawns went back and promised to return in 3 years to rescue our hero and change him back to human! Didn’t I tell you that the person who is trying to evacuate District 9 accidentally spills a chemical fuel that starts mutating him into the same alien race he was trying to evacuate? Anyways, now you know most of the story. Go ahead and watch it. You will not regret it :).