<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Boschmans Account &#187; python</title>
	<atom:link href="http://www.boschmans.net/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.boschmans.net</link>
	<description>A collection of interests and happenings...</description>
	<lastBuildDate>Wed, 01 Feb 2012 22:21:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Proof of concept for a simple webserver running python code</title>
		<link>http://www.boschmans.net/2012/01/31/proof-of-concept-for-a-simple-webserver-running-python-code/</link>
		<comments>http://www.boschmans.net/2012/01/31/proof-of-concept-for-a-simple-webserver-running-python-code/#comments</comments>
		<pubDate>Tue, 31 Jan 2012 21:31:27 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Blog News]]></category>
		<category><![CDATA[cherrypy]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[website]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=1113</guid>
		<description><![CDATA[Here is a small code example using CherryPy to run a very simple webserver that generates a simple math question and compares the answer to the solution. It&#8217;s meant as a proof of concept, so there is no security built in. &#8230; <a href="http://www.boschmans.net/2012/01/31/proof-of-concept-for-a-simple-webserver-running-python-code/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Here is a small code example using <a title="CherryPy" href="http://cherrypy.org/">CherryPy</a> to run a very simple webserver that</p>
<ul>
<li>generates a simple math question</li>
<li>compares the answer to the solution.</li>
</ul>
<p>It&#8217;s meant as a <em>proof of concept</em>, so there is no security built in. It&#8217;s running on localhost on port 8888 (modifiable in the main part of the code).<br />
It allows you to play around and test out your ideas.</p>
<p><span style="color: #ff0000;">Do not use this code on an outside network !</span></p>
<p>It&#8217;s simply an example showing how easy it is to set up a web server and how you can create pages for it using python and CherryPy. It&#8217;s been cobbled together in an evening from previous programming so there&#8217;s some cruft left in. I&#8217;ve also extensively commented the code.</p>
<p>Requirements:</p>
<ul>
<li>python 2.7 (my guess is that 2.5 will work as well)</li>
<li><a title="CherryPy" href="http://cherrypy.org/">cherrypy</a> 3.2.2 ( use easy_install or pip to download and install the latest version)</li>
<li><a href='http://www.boschmans.net/wp-content/uploads/2012/01/site.py_1.txt'>site.py</a> ( the file containing the python code )</li>
</ul>
<p>Start the server from a command prompt with <em>python site.py</em> and leave the command prompt open.</p>
<p>You can then visit the webserver by opening a browser and going to <a href="http://localhost:8888">http://localhost:8888</a> to see the index page and play around with it.</p>
<pre class="brush: python; title: ; notranslate">
#
# MathPoc : Proof of concept of a simple math problem, bringing it to the browser
#
# Alex Boschmans
#
# Version 0.2, February 2011
#
# 0.2 Added some error checks and expanded math to not just adding but also
# subtraction and multiplication and divisions. Extensively commented code.

header = &quot;&quot;&quot;&lt;HTML&gt;
            &lt;HEAD&gt;
                &lt;title&gt;MATH Proof of Concept&lt;/title&gt;
            &lt;/HEAD&gt;
            &lt;BODY&gt;
        &quot;&quot;&quot;
footer = &quot;&lt;/BODY&gt;&lt;/HTML&gt;&quot;

indexhtml = &quot;&quot;&quot;
        &lt;H1&gt;Math Proof of Concept&lt;/H1&gt;
        &lt;p&gt;Please answer the following question&lt;/p&gt;
                &lt;p&gt;How much is %d %s %d ? &lt;/p&gt;

                &lt;form action=&quot;/response&quot; method=&quot;post&quot;&gt;
                Answer: &lt;input type=&quot;text&quot; name=&quot;answer&quot; /&gt;
                &lt;input type=&quot;hidden&quot; name=&quot;number1&quot; value=&quot;%d&quot;&gt;
                &lt;input type=&quot;hidden&quot; name=&quot;number2&quot; value=&quot;%d&quot;&gt;
                &lt;input type=&quot;hidden&quot; name=&quot;operation&quot; value=&quot;%s&quot;&gt;
                &lt;input type=&quot;submit&quot; value=&quot;Submit&quot; /&gt;
                &lt;/form&gt;

        &quot;&quot;&quot;

def generatequestion():
    # This generates the question that we will pose using the random function
    # Generate a random question using 2 random numbers between 1 and 10
    number1 = random.randint(1,10)
    number2 = random.randint(1,10)
    # Now we choose an operation
    ops = [&quot;+&quot;, &quot;-&quot;, &quot;x&quot;, &quot;/&quot;]
    operation = random.choice(ops)
    # Let's check the division
    if operation == &quot;/&quot;:
        # Prevent divisions with remainders using the modulo operator
        # Using modulo on the two numbers evaluates to 0 when no remainder is present
        # While the modulo remainder is not equal to 0, generate two new numbers
        while number1 % number2 != 0:
            number1 = random.randint(1,10)
            number2 = random.randint(1,10)
    # Assemble the html, inputting the numbers in the foreseen places in the html
    # In a more extensive project, you would keep this html in a template file and
    # call it with a dictionary of items that need to be filled in the template
    question = indexhtml % (number1, operation, number2, number1, number2, operation)
    # Add common html like header and footer - these are defined just once and reused
    # for each page
    html = header + question + footer
    # Return the completed html to the calling function (in this case index)
    return html

# This is the class that the cherrypy server uses and where you create the views that the
# webuser sees. After each definition there is a &lt;function&gt;.exposed=True that indicates if the
# webuser can see this page or not.
class MathPoc:
    def index(self):
        # This is the main index page that is shown to the user when he first visits the site.
        # We create the page by calling the function generatequestion (which is outside the class
        # MathPoc but accessible) and we show it to the user by 'return'ing the page
        page = generatequestion()
        return page
        # The webuser will now see the page and will have a chance to enter an answer.
        # In the html form I've specified that the submitted result will go to the url &quot;response&quot;
        # I've added all the values I want to receive either as hidden values (eg the
        # original numbers, the operation) or as part of the form (eg the answer)
    index.exposed = True

    def response(self, answer, number1, number2, operation):
        # First check if we received a usable answer: int() raises ValueError for an
        # empty or non-numeric answer, which we treat the same as no answer at all
        try:
            answer = int(answer)
        except ValueError:
            answer = None
        if answer is not None:
            # Calculate our own answer ourselves and generate a response to the user
            # We receive strings, so convert them to integers using int()
            number1 = int(number1)
            number2 = int(number2)
            # Answer is dependent on operation
            if operation == &quot;+&quot;:
                solution = number1 + number2
            elif operation == &quot;-&quot;:
                solution = number1 - number2
            elif operation == &quot;x&quot;:
                solution = number1 * number2
            else:
                solution = number1 / number2
            # See if the answer is correct and display according to the result
            # Using templates, you could put all this in one template and
            # call the template with options so it knows what to show
            if solution != answer:
                html = &quot;&quot;&quot;
                &lt;H1&gt;Sorry.&lt;/H1&gt;
                &lt;p&gt;The question was : %s %s %s = ?&lt;/p&gt;
                &lt;p&gt;Your answer %s is wrong. The correct answer is %d.&lt;/p&gt;
                &lt;p&gt;&lt;a href = &quot;/&quot;&gt;Try Again.&lt;/a&gt;&lt;/p&gt;
                &quot;&quot;&quot; % (number1, operation, number2, answer, solution)
            else:
                html = &quot;&quot;&quot;
                &lt;H1&gt;Correct !&lt;/H1&gt;
                &lt;p&gt;The question was : %s %s %s = %s&lt;/p&gt;
                &lt;p&gt;Your answer is correct !&lt;/p&gt;
                &lt;p&gt;&lt;a href = &quot;/&quot;&gt;Try Again.&lt;/a&gt;&lt;/p&gt;
                &quot;&quot;&quot; % (number1, operation, number2, answer)
        else:
            # We did not receive an answer
            html = &quot;&quot;&quot;
            &lt;h1&gt;Sorry ?&lt;/h1&gt;
            &lt;p&gt;You need to fill in an answer !&lt;/p&gt;
            &lt;p&gt;&lt;a href = &quot;/&quot;&gt;Try Again.&lt;/a&gt;&lt;/p&gt;
            &quot;&quot;&quot;
        # Return the page to the user, adding the common html
        return header + html + footer
    response.exposed = True

if __name__ == '__main__':
    import random
    import cherrypy
    import os, sys
    # Set the current directory - this is probably not needed for this example, cruft.
    try:
        current_dir = os.path.dirname(os.path.abspath(__file__))
    except:
        # probably running inside py2exe which doesn't set __file__
        current_dir = os.path.dirname(unicode(sys.executable, sys.getfilesystemencoding( )))

    # Set up site-wide config first so we get a log if errors occur.
    # Adding the setting 'environment': 'production' to the below turns off auto-reload.
    # Otherwise CherryPy monitors the code and any change to code reloads the server - handy for development !
    cherrypy.config.update({'server.socket_port':8888,
                            'server.socket_host':'127.0.0.1',
                            'log.error_file': 'site.log',
                            'log.screen': True})
    # CherryPy will complain of an empty config but will continue
    conf = {}
    # quickstart() mounts the application at '/' and starts the server, so no
    # separate cherrypy.tree.mount() call is needed
    cherrypy.quickstart(MathPoc(), '/', config=conf)
</pre>
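<p>The comments in the code mention keeping the html in a template and filling it in from a dictionary. As a rough sketch of that idea, and only a sketch: the standard library&#8217;s <em>string.Template</em> can do the substitution. The names <em>question_template</em> and <em>render_question</em> are made up for this illustration and are not part of site.py, and the html tags are left out to keep it short.</p>
<pre class="brush: python; title: ; notranslate">
from string import Template

# Hypothetical template text; in a real project this would live in its own file
question_template = Template('How much is $number1 $operation $number2 ?')

def render_question(number1, operation, number2):
    # Collect the values in a dictionary keyed by the placeholder names
    values = {'number1': number1, 'operation': operation, 'number2': number2}
    # substitute() raises KeyError if a placeholder has no matching key
    return question_template.substitute(values)

if __name__ == '__main__':
    print(render_question(6, 'x', 7))
</pre>
<p>Because the placeholders are named, the same dictionary could feed both the visible question and the hidden form fields without repeating the values in the right order.</p>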
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2012/01/31/proof-of-concept-for-a-simple-webserver-running-python-code/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Ouch! Rate limiting on twitter request.</title>
		<link>http://www.boschmans.net/2010/04/15/ouch-rate-limiting-on-twitter-request/</link>
		<comments>http://www.boschmans.net/2010/04/15/ouch-rate-limiting-on-twitter-request/#comments</comments>
		<pubDate>Thu, 15 Apr 2010 21:57:44 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=978</guid>
		<description><![CDATA[I just ran into the rate limiting of twitter. It hurts. The limit is 150 twitter requests an hour, although the search allows more&#8230; Have to think how to give twitter some rest in between my scurrying in their databases &#8230; <a href="http://www.boschmans.net/2010/04/15/ouch-rate-limiting-on-twitter-request/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I just ran into the rate limiting of twitter. It hurts. The limit is 150 twitter requests an hour, although the search allows more&#8230;</p>
<p>Have to think how to give twitter some rest in between my scurrying in their databases for user info&#8230; I&#8217;m gonna sleep on it.</p>
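<p>One simple way to give twitter that rest would be to space the requests out evenly: 150 requests an hour works out to one request every 3600 / 150 = 24 seconds. A minimal sketch of that idea follows. The <em>Throttle</em> class and its parameter names are invented for this example, and the 150 per hour figure is simply the limit mentioned above.</p>
<pre class="brush: python; title: ; notranslate">
import time

class Throttle(object):
    # Hypothetical helper: spaces calls so they stay under a per-hour limit
    def __init__(self, max_per_hour, clock=time.time, sleep=time.sleep):
        self.interval = 3600.0 / max_per_hour   # seconds between two requests
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            # Time still to wait before the next request is allowed (never negative)
            pause = max(0.0, self.interval - (now - self.last_call))
            self.sleep(pause)
            now = now + pause
        self.last_call = now

if __name__ == '__main__':
    throttle = Throttle(150)
    print(throttle.interval)
</pre>
<p>Calling <em>throttle.wait()</em> right before every request would then keep the requests at least 24 seconds apart. The clock and sleep functions are passed in only so that the class can be tried out without really waiting.</p>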
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2010/04/15/ouch-rate-limiting-on-twitter-request/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using CherryPy for webform authentication</title>
		<link>http://www.boschmans.net/2010/03/07/using-cherrypy-for-webform-authentication/</link>
		<comments>http://www.boschmans.net/2010/03/07/using-cherrypy-for-webform-authentication/#comments</comments>
		<pubDate>Sun, 07 Mar 2010 21:48:57 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[cherrypy]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[website]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=965</guid>
		<description><![CDATA[If you are using CherryPy, I can recommend the webform-based authentication that Arnar Birgisson wrote, for its ease of use and extensibility. After trying out the included authentication models with CherryPy (I&#8217;m using 3.1.2, the last stable version at the moment &#8230; <a href="http://www.boschmans.net/2010/03/07/using-cherrypy-for-webform-authentication/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>If you are using CherryPy, I can recommend the webform-based authentication that <a title="Arnar Birgisson" href="http://www.hvergi.net/arnar/" target="_blank">Arnar Birgisson</a> wrote, for its ease of use and extensibility.</p>
<p>After trying out the included authentication models with CherryPy (I&#8217;m using 3.1.2, the last stable version at the moment of writing), I was disappointed in the results. Then I stumbled over a recommendation from someone on <a title="Nabble Forum System" href="http://old.nabble.com/" target="_blank">Nabble</a>, a web-based programmers&#8217; discussion forum, which pointed to the following wiki page on the CherryPy site:</p>
<p><a title="CherryPy Authentication and Access restrictions" href="http://tools.cherrypy.org/wiki/AuthenticationAndAccessRestrictions" target="_blank">http://tools.cherrypy.org/wiki/AuthenticationAndAccessRestrictions</a></p>
<p>The complete program code plus examples are on the page and are well explained.</p>
<p>You can have a skeleton login system (using a hardcoded dictionary) up and running in literally half an hour !</p>
<ul>
<li> Just copy/paste the code on the page and save it as auth.py in your cherrypy script dir.</li>
<li>Add the hardcoded dictionary containing username and passwords to it (or script the db access, see the example included)</li>
<li>Put &#8216;require()&#8217; everywhere on your cherrypy pages that need to have login protection &#8211; additionally, you can also have roles so that only admins can access certain pages.</li>
</ul>
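<p>To give an idea of what the hardcoded dictionary step amounts to, here is a rough sketch. It is not the code from the wiki page: the names <em>USERS</em> and <em>check_credentials</em> and the return convention (None for a valid login, a message otherwise) are assumptions made for this illustration.</p>
<pre class="brush: python; title: ; notranslate">
# Hypothetical hardcoded users: username mapped to password (illustration only,
# a real system would store password hashes instead of plain passwords)
USERS = {'alex': 'secret', 'admin': 'letmein'}

def check_credentials(username, password):
    # Return None when the login is valid, otherwise a message for the login form
    if username in USERS and USERS[username] == password:
        return None
    return 'Incorrect username or password.'

if __name__ == '__main__':
    print(check_credentials('alex', 'wrong'))
</pre>
<p>Replacing the dictionary by a db lookup then only means changing the body of this one function.</p>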
<p>Early last week I replaced that hardcoded dictionary and built the db lookup query for the login. Once that was working, I added a &#8216;my profile&#8217; page to the application I&#8217;m working on. Then I thought it would be nice for the admin to have a &#8216;create user&#8217; form in the admin section to add users. I did that as well, using jquery-ui to create tabs and separate the content in the admin section.</p>
<p>All in all, a nice week of nice work.</p>
<p>I&#8217;m starting to think this might make its way to my hosting server in one of the coming weeks&#8230; although I need to do some more work on showing the user only his keywords and not all the keywords, as well as doing something with the keywords to use them better.</p>
<p>Oh, and one more thing: this works better under SSL than over plain http, where the login travels in the clear !</p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2010/03/07/using-cherrypy-for-webform-authentication/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Simple Python threading.Thread example using Queue</title>
		<link>http://www.boschmans.net/2010/02/03/simple-python-threading-thread-example/</link>
		<comments>http://www.boschmans.net/2010/02/03/simple-python-threading-thread-example/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 22:17:23 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=935</guid>
		<description><![CDATA[I managed to write a really simple example of using threads in Python that I hope will give more insight on how to adapt my other programming stuff. And re-use this later on, in case I need to revisit this &#8230; <a href="http://www.boschmans.net/2010/02/03/simple-python-threading-thread-example/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I managed to write a really simple example of using threads in Python that I hope will give more insight on how to adapt my other programming stuff. I also want to be able to re-use it later on: if I need to revisit this, it would be handy not to have to scour the internet <em>again</em> to assemble the bits and pieces of threading with Python.</p>
<p>The example below uses 3 threads, and processes 10 pairs of numbers (tuples) that I put in a list.</p>
<pre class="brush: python; title: ; notranslate">
# Our list of work todo
inputlist_ori = [ (5,5),(10,4),(78,5),(87,2),(65,4),(10,10),(65,2),(88,95),(44,55),(33,3) ]
</pre>
<p>Those numbers are divided over those 3 threads by the Queue system.</p>
<p>The Queue system itself is limited to 5 slots, although this could easily be changed to more or less. You will notice in the console print that the message &#8220;Waiting for threads to finish.&#8221; appears after the fifth result, indicating that the queues are being used and the main program has continued on.</p>
<p>After putting everything in the queue system, the program waits for the work to finish using the queue&#8217;s .join() function, which returns once every job has been marked as done.</p>
<p>All spawned threads stay active and keep accepting jobs &#8211; that is, until the queue has been empty for a second, at which point they shut down.</p>
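<p>The effect of the limited number of slots can be seen in isolation with a small sketch. It uses nothing beyond the standard library. The try/except import is only there so that it runs on both Python 2 (module <em>Queue</em>) and Python 3 (module <em>queue</em>), and the queue size of 2 and the names <em>slots</em> and <em>outcome</em> are arbitrary choices for the illustration.</p>
<pre class="brush: python; title: ; notranslate">
try:
    import queue              # Python 3 name
except ImportError:
    import Queue as queue     # Python 2 name

slots = queue.Queue(2)        # a queue with room for only 2 items
slots.put('job 1')
slots.put('job 2')

try:
    # Nobody is taking jobs out, so this put waits 0.1 s and then gives up
    slots.put('job 3', block=True, timeout=0.1)
    outcome = 'stored'
except queue.Full:
    outcome = 'queue full'

print(outcome)
</pre>
<p>With worker threads taking jobs out in the meantime, the same put would simply wait until a slot comes free, which is why the threads have to be running before the queue is filled.</p>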
<p>I based most of my simple example on the examples in the Python threading <a title="Python Threading tutorial" href="http://heather.cs.ucdavis.edu/~matloff/Python/PyThreads.pdf" target="_blank">tutorial</a> (.pdf) work of Norman Matloff and Francis Hsu that I referenced before in <a title="Python Threading information" href="http://www.boschmans.net/2010/01/26/using-threads-in-python/" target="_blank">a previous blog post</a>. However, while their examples undoubtedly do more and are more extensive, they are also more complex. This example is deliberately made as simple as possible so to understand the basic principles of threading and the queue system.</p>
<p>Things I stumbled over:</p>
<ul>
<li>Duh! You spawn the threads <em>before</em> you fill up the queues with stuff todo&#8230;</li>
<li>When printing out things to the console or python shell, things got jumbled because different threads took over from each other &#8211; to solve that I created a single threading.Lock() and call its acquire() and release() around each print, to make sure that a thread can finish printing. Not sure if I completely understand all the possibilities this offers.</li>
<li>Still a bit stumped on getting more info, name, etc on the thread that is running at the moment &#8211; haven&#8217;t figured out yet how to do that.</li>
</ul>
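<p>On the last two points, a small sketch that may help. A lock only protects anything when every thread uses the same lock object, and the <em>with</em> statement acquires it and releases it again even if the code inside fails. The name of the running thread is available through <em>threading.current_thread().name</em> (Python 2.6 and later). The names <em>printlock</em>, <em>report</em> and <em>seen</em> are made up for this example.</p>
<pre class="brush: python; title: ; notranslate">
import threading

printlock = threading.Lock()    # one lock object shared by all threads
seen = []                       # names of the threads that have reported in

def report():
    name = threading.current_thread().name
    # 'with' acquires the lock here and releases it at the end of the block
    with printlock:
        seen.append(name)
        print('Hello from ' + name)

threads = [threading.Thread(target=report, name='bee-%d' % i) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # wait until every thread has finished
</pre>
<p>The order of the printed lines may differ from run to run, but each line comes out whole.</p>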
<p>Feel free to comment and ask questions &#8211; if you can improve this program, please let me know !</p>
<pre class="brush: python; title: ; notranslate">
# threading test
# Alex Boschmans
# www.boschmans.net
# January 2010

#
# IMPORT SECTION
#
import threading, Queue

#
# Variables setup
#
THREAD_LIMIT = 3                # This is how many threads we want
jobs = Queue.Queue(5)           # This sets up the queue object to use 5 slots
singlelock = threading.Lock()   # This is a lock so threads don't print through each other (and other reasons)

# Our list of work todo
inputlist_ori = [ (5,5),(10,4),(78,5),(87,2),(65,4),(10,10),(65,2),(88,95),(44,55),(33,3) ]

#
# This is called from the main function
# It spawns the threads, fills up the queue with work items that the threads will use
# And then waits for the threads to finish
# This could use some more try:except code...
#
def draadje(inputlist):
    print &quot;Inputlist received...&quot;
    print inputlist

    # Spawn the threads
    print &quot;Spawning the {0} threads.&quot;.format(THREAD_LIMIT)
    for x in xrange(THREAD_LIMIT):
        print &quot;Thread {0} started.&quot;.format(x)
        # This is the thread class that we instantiate.
        workerbee().start()

    # Put stuff in queue
    print &quot;Putting stuff in queue&quot;
    for i in inputlist:
        # Block if queue is full, and wait 5 seconds. After 5s raise Queue Full error.
        try:
            jobs.put(i, block=True, timeout=5)
        except Queue.Full:
            singlelock.acquire()
            print &quot;The queue is full !&quot;
            singlelock.release()

    # Wait for the threads to finish
    singlelock.acquire()        # Acquire the lock so we can print
    print &quot;Waiting for threads to finish.&quot;
    singlelock.release()        # Release the lock
    jobs.join()                 # This waits until every job in the queue has been marked as done.

#
# Main thread class - based on threading.Thread
# This class is cloned/used as a thread template to spawn those threads.
# The class has a run function that gets a job out of the jobs queue
# And lets the queue object know when it has finished.
#
class workerbee(threading.Thread):
    def run(self):
        # run forever
        while 1:
            # Try and get a job out of the queue
            try:
                job = jobs.get(True,1)
                singlelock.acquire()        # Acquire the lock
                print &quot;Multiplication of {0} with {1} gives {2}&quot;.format(job[0],job[1],(job[0]*job[1]))
                singlelock.release()        # Release the lock
                # Let the queue know the job is finished.
                jobs.task_done()
            except Queue.Empty:
                break           # No more jobs in the queue

#
# Executes if the program is started normally, not if imported
#
if __name__ == '__main__':
    # Call the mainfunction that sets up threading.
    draadje(inputlist_ori)
</pre>
<p>Sigh. I just finished adding spaces to show where a def ends, and the damn code highlighter removed it again. Grrrrrr. If you want a copy of the code, let me know and I&#8217;ll update this post with a zipped copy of it.</p>
<p>Update: just discovered the &#8220;Syntaxhighlighter evolved&#8221; <a title="Syntaxhighligher evolved plugin" href="http://wordpress.org/extend/plugins/syntaxhighlighter/" target="_blank">plugin</a> and updated the code &#8211; indentation now works !!!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2010/02/03/simple-python-threading-thread-example/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Not using regular expressions (re or regex) to find a #hashtag (python).</title>
		<link>http://www.boschmans.net/2010/01/27/not-using-regular-expressions-re-or-regex-to-find-a-hashtag-python/</link>
		<comments>http://www.boschmans.net/2010/01/27/not-using-regular-expressions-re-or-regex-to-find-a-hashtag-python/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 22:01:08 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=924</guid>
		<description><![CDATA[First, a quick reminder for myself: there&#8217;s an extremely good guide to regex on Andrew M. Kuchling&#8217;s pages. Secondly, you don&#8217;t really *need* regex to parse for hashtags in a tweet &#8211; it&#8217;s a bit of overkill. The following code &#8230; <a href="http://www.boschmans.net/2010/01/27/not-using-regular-expressions-re-or-regex-to-find-a-hashtag-python/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>First, a quick reminder for myself: there&#8217;s an extremely good guide to regex on Andrew M. Kuchling&#8217;s <a title="Regex for python explanation" href="http://www.amk.ca/python/howto/regex/" target="_blank">pages</a>.</p>
<p>Secondly, you don&#8217;t really *need* regex to parse for hashtags in a tweet &#8211; it&#8217;s a bit of overkill. The following code will do as well, and was written in 1 minute, after I had spent 15 minutes searching for a regex that also includes hyphens ( &#8211; ) and other non-word characters when they are part of the hashtag.</p>
<p>The regular expression that I find works quite well for all hashtags that don&#8217;t have a hyphen in them:</p>
<pre class="brush: python; title: ; notranslate">
&gt;&gt;&gt; import re
&gt;&gt;&gt; hashtag = &quot;This is a #hashtag #test-link #a should#not#work&quot;
&gt;&gt;&gt; x = re.compile(r'\B#\w+')
&gt;&gt;&gt; x.findall(hashtag)
['#hashtag', '#test', '#a']
</pre>
<p>So the above code correctly finds all words beginning with a hash sign, and not the ones that contain a hash sign inside the word. Note that the hyphen and the word after it are not included.</p>
<p>This is the short code I wrote that does all I want:</p>
<pre class="brush: python; title: ; notranslate">
&gt;&gt;&gt; hashtag = &quot;This is a #hashtag #test-link #a should#not#work&quot;
&gt;&gt;&gt; for word in hashtag.split():
	if word[0] == &quot;#&quot;:
		print word
#hashtag
#test-link
#a
</pre>
<p>In section 6 of the above-mentioned guide, Andrew states that in some cases string methods (like split) are faster than using regex. For simplicity, I&#8217;m going to use the latter code.</p>
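<p>For the record, a character class appears to be enough to keep the hyphen in the regex version: <em>[\w-]</em> matches a word character or a hyphen. A small sketch using the same test sentence as above; the names <em>pattern</em> and <em>found</em> are just for this example.</p>
<pre class="brush: python; title: ; notranslate">
import re

hashtag = 'This is a #hashtag #test-link #a should#not#work'
# \B keeps out a '#' glued to the end of a word; [\w-]+ allows hyphens in the tag
pattern = re.compile(r'\B#[\w-]+')
found = pattern.findall(hashtag)
print(found)
</pre>
<p>This finds #hashtag, #test-link and #a. Other characters that should count as part of a hashtag would have to be added to the character class in the same way.</p>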
<p>Update: Grrr &#8211; discovered that the tweets I am processing are in html, so the hashtags have href tags around them &#8211; which of course means that there are no blanks for me to split the words on. After another unsuccessful session with regex, and just to be able to continue, I&#8217;ve used the <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup html parsing library</a> to get around that by stripping out all tags and then splitting the sentence up again. Probably not as efficient as using regex directly; I&#8217;ll have to revisit this in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2010/01/27/not-using-regular-expressions-re-or-regex-to-find-a-hashtag-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using threads in Python</title>
		<link>http://www.boschmans.net/2010/01/26/using-threads-in-python/</link>
		<comments>http://www.boschmans.net/2010/01/26/using-threads-in-python/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 22:18:10 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=918</guid>
		<description><![CDATA[I&#8217;ve been trying to setup threading in Python, so that in the back-end of my service system that I&#8217;m developing I can query more than one source at the same time. So instead of querying one server and waiting for &#8230; <a href="http://www.boschmans.net/2010/01/26/using-threads-in-python/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been trying to set up <a title="Threading in Python scripting language" href="http://docs.python.org/library/threading.html?highlight=thread#module-threading" target="_blank">threading</a> in Python, so that the back-end of the service system I&#8217;m developing can query more than one source at the same time. So instead of querying one server and waiting for feedback, I can launch 10 threads and thus query 10 servers and process each server&#8217;s feedback via its own thread.</p>
<p>So a very vague, generalising definition of a thread is an independent &#8216;process&#8217; that performs a job that you give it. You can control how many threads that you launch. Each thread is a copy of the original thread that you describe (in essence a python def function that has been wrapped in a thread class).</p>
<p>Right now, my understanding of threads is a bit confused. So far it seems that there are several different ways of using them:</p>
<ul>
<li>using a number of threads that you launch, use, and then forget about (they go away)</li>
<li>improving on that by putting those threads in a thread pool, and when a thread finishes, re-using it for the next job (so you have  5 threads but 10 jobs to do, those five threads take five jobs, and the first thread that finishes takes on the sixth job, the second thread to finish the seventh job, and so on)</li>
<li>the final step seems to be (I haven&#8217;t got that far in my implementation) to set up worker bees that are managed by one thread (a better description is promised, as soon as I have understood it!)</li>
</ul>
<p>Since I&#8217;ve been scouring the net for information over threads, here is a list of resources that discuss, give examples, and explain threads &#8211; it&#8217;s useful for me to refer to, it might be useful for you as well :</p>
<ul>
<li>DaveN has <a title="DaveN, Search Engine Optimization" href="http://www.davidnaylor.co.uk/threaded-data-collection-with-python-including-examples.html" target="_blank">an extensive post</a>, with examples, building up gradually. It&#8217;s only at the end that you read that the code shown has never been run, which is a bit of a letdown. Still worth a good read though !!</li>
<li>A very thorough 25-page pdf document that starts from the beginning is <a title="UC Davis explaining threading in Python" href="http://heather.cs.ucdavis.edu/~matloff/Python/PyThreads.pdf" target="_blank">available on the site of UC Davis</a>, University of California. It goes into all the nitty gritty details.</li>
<li>An example that uses workers in threads is found on <a title="Python threading using workers" href="http://www.taher-zadeh.com/blog/threading-in-python-tutorial" target="_blank">the blog of Danial Taherzadeh</a>.</li>
<li>Another one that discusses using multiple queues chained together can be <a title="IBM Developerworks - practical threaded programming in pytho" href="http://www.ibm.com/developerworks/aix/library/au-threadingpython/" target="_blank">found</a> on IBM&#8217;s developerworks site.</li>
<li>And the <a title="Halotis getting feeds into a db using python" href="http://www.halotis.com/2009/07/07/how-to-get-rss-content-into-an-sqlite-database-with-python-fast/" target="_blank">blog post from Halotis</a> that started my looking into threads&#8230;</li>
</ul>
<p>Right now I&#8217;m using threads in a thread pool, but I&#8217;m not doing something right &#8211; I noticed that while I have 10 jobs to do, only the first five get done, and the others &#8216;disappear&#8217;.</p>
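<p>One possible cause (a guess, not a diagnosis) is that each thread takes a single job and then exits, so five threads finish five jobs and the other five stay behind in the queue. Below is a minimal sketch of a pool in which every worker keeps taking jobs until the queue is empty. The names <em>worker</em>, <em>jobs</em>, <em>results</em> and <em>resultlock</em> are invented for this example, and the try/except import only serves to run it on both Python 2 and Python 3.</p>
<pre class="brush: python; title: ; notranslate">
import threading
try:
    import queue              # Python 3 name
except ImportError:
    import Queue as queue     # Python 2 name

jobs = queue.Queue()          # no size limit, so it can be filled before the workers start
results = []
resultlock = threading.Lock()

def worker():
    # Keep taking jobs until the queue is empty, instead of stopping after one
    while True:
        try:
            number = jobs.get(block=False)
        except queue.Empty:
            return
        with resultlock:
            results.append(number * number)
        jobs.task_done()

# 10 jobs for 5 threads
for number in range(10):
    jobs.put(number)

pool = [threading.Thread(target=worker) for i in range(5)]
for thread in pool:
    thread.start()
jobs.join()                   # returns once task_done() was called for all 10 jobs
print(len(results))
</pre>
<p>All 10 jobs end up in <em>results</em>, whichever thread happened to pick them up.</p>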
<p>I guess the only way to get it working is to continue reading the information above <em>until it makes sense</em>. Sometimes I wonder if I&#8217;m not slightly masochistic, looking for challenges like that&#8230; ai me poor pounding head ! <img src='http://www.boschmans.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2010/01/26/using-threads-in-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>What about flex on this blogpost ?</title>
		<link>http://www.boschmans.net/2009/12/19/what-about-flex-on-this-blogpost/</link>
		<comments>http://www.boschmans.net/2009/12/19/what-about-flex-on-this-blogpost/#comments</comments>
		<pubDate>Fri, 18 Dec 2009 23:34:02 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Blog News]]></category>
		<category><![CDATA[flex3]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=893</guid>
		<description><![CDATA[Those few regular readers out there have probably noticed that I no longer post regularly about Adobe Flex. Please be assured that Flex is not out of the picture ! Rather, I wanted to learn Flex enough to &#8230; <a href="http://www.boschmans.net/2009/12/19/what-about-flex-on-this-blogpost/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Those few regular readers out there have probably noticed that I no longer post regularly about Adobe Flex.</p>
<p>Please be assured that Flex is not out of the picture ! Rather, I wanted to learn Flex enough to get by in it. It&#8217;s been *very* interesting, but also very hard sometimes to wrap my head around ActionScript and MXML. Now that I know a bit about what I can do with Flex, I&#8217;ve started again with Python and more specifically with CherryPy.</p>
<p><a title="Cherry.py" href="http://www.cherrypy.org" target="_blank">CherryPy</a> is a very easy-to-use web framework that you can use to set up your own webserver in a flash. It provides a basic syntax for setting up the webservice, then scurries out of the way, letting you &#8216;get on with it&#8217;, whatever that may be.</p>
<p>Currently I&#8217;m setting up a local webserver (using CherryPy) and this is where most of my time has gone.</p>
<p>Once the python application on there has been created (and most of it has) I will then head back to Flex and its uses as a reporting tool &#8211; I&#8217;ll be trying to use <a title="PyAMF, Adobe Messaging for Python" href="http://pyamf.org/" target="_blank">PyAMF</a> as the glue between python functions and Flex datagrids.</p>
<p>Anyways, more on that later&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2009/12/19/what-about-flex-on-this-blogpost/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Feedparser.py and its uses&#8230;</title>
		<link>http://www.boschmans.net/2009/12/19/feedparser-py-and-its-uses/</link>
		<comments>http://www.boschmans.net/2009/12/19/feedparser-py-and-its-uses/#comments</comments>
		<pubDate>Fri, 18 Dec 2009 23:22:50 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=870</guid>
		<description><![CDATA[I recently discovered feedparser.py, a library written by Mark Pilgrim that is amazing if you want to use python to consume rss feeds. It &#8216;normalizes&#8217; the different versions of rss/atom out there into one structure that you can use consistently. &#8230; <a href="http://www.boschmans.net/2009/12/19/feedparser-py-and-its-uses/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I recently discovered feedparser.py, a library written by Mark Pilgrim that is amazing if you want to use python to consume <a title="Explanation of rss feeds" href="http://en.wikipedia.org/wiki/RSS" target="_blank">rss</a> feeds. It &#8216;normalizes&#8217; the different versions of rss/atom out there into one structure that you can use consistently. It doesn&#8217;t matter if it&#8217;s Atom 0.3 or 1.0.</p>
<p>A few links that are interesting together with feedparser.py as they show its usage:</p>
<ul>
<li><a title="Feedparser" href="http://www.feedparser.org/" target="_blank">feedparser.org</a> main page, which gives some examples</li>
<li>Using feedparser.py to <a href="http://www.teebes.com/blog/17/?c=18" target="_blank">query twitter</a> and put those tweets on your website</li>
<li><a title="storing rss content in a database" href="http://www.halotis.com/2009/07/07/how-to-get-rss-content-into-an-sqlite-database-with-python-fast/" target="_blank">how to store rss content in a database</a> using feedparser.py</li>
</ul>
<p>I&#8217;m constantly amazed at the quality of the python code that is out there and that you can find with a simple google query. It certainly makes me think that choosing Python over, say, Perl was a good decision.</p>
<p>As for using feedparser.py to put relevant tweets on your website, note that you can also use javascript to achieve the same thing; go <a title="Twitter widget for your website" href="http://twitter.com/goodies/widgets" target="_blank">here</a> for some twitter.com goodies and an explanation of how to set this up.</p>
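<p>To illustrate what &#8216;normalizing&#8217; means, here is a rough sketch that uses only the standard library, not feedparser.py itself, and certainly not the way feedparser.py is implemented. The two sample feeds, the <code>entries</code> function and the example.com links are made up for illustration. It reads a tiny RSS 2.0 feed and a tiny Atom 1.0 feed and returns the same list of dictionaries for both. feedparser.py does this far more thoroughly and exposes the result through <code>feedparser.parse(...)</code> and its <code>entries</code> list.</p>

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feeds carrying the same post in two formats.
RSS_SAMPLE = """<rss version="2.0"><channel><title>Demo</title>
<item><title>First post</title><link>http://example.com/1</link></item>
</channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"><title>Demo</title>
<entry><title>First post</title><link href="http://example.com/1"/></entry>
</feed>"""


def entries(xml_text):
    """Return a list of {'title', 'link'} dicts for an RSS 2.0 or Atom 1.0 feed."""
    root = ET.fromstring(xml_text)
    found = []
    if root.tag == "rss":
        # RSS 2.0: items are un-namespaced, the link is element text.
        for item in root.iter("item"):
            found.append({"title": item.findtext("title"),
                          "link": item.findtext("link")})
    elif root.tag == ATOM_NS + "feed":
        # Atom 1.0: entries are namespaced, the link is an href attribute.
        for entry in root.iter(ATOM_NS + "entry"):
            link = entry.find(ATOM_NS + "link")
            found.append({"title": entry.findtext(ATOM_NS + "title"),
                          "link": link.get("href") if link is not None else None})
    return found


if __name__ == "__main__":
    print(entries(RSS_SAMPLE))
    print(entries(ATOM_SAMPLE))
```

<p>Both calls give the same structure, so the code that consumes the entries does not need to know which format the feed was in.</p>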
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2009/12/19/feedparser-py-and-its-uses/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cleaning up user input variables on the web (Python)</title>
		<link>http://www.boschmans.net/2009/12/05/cleaning-up-user-input-variables-on-the-web-python/</link>
		<comments>http://www.boschmans.net/2009/12/05/cleaning-up-user-input-variables-on-the-web-python/#comments</comments>
		<pubDate>Sat, 05 Dec 2009 19:38:12 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=873</guid>
		<description><![CDATA[Only recently I&#8217;ve discovered the power of &#8216;re&#8217;, the python regular expression library. Instead of writing long functions that process text character by character to add or remove stuff, you use re, write an expression in regex that achieves what &#8230; <a href="http://www.boschmans.net/2009/12/05/cleaning-up-user-input-variables-on-the-web-python/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Only recently I&#8217;ve discovered the power of &#8216;re&#8217;, the python regular expression library. Instead of writing long functions that process text character by character to add or remove stuff, you use re, write an expression in regex that achieves what you want and basta! In a few lines things get done.</p>
<p>For example the following function will remove any html tags (preventing <a title="Cross Site Scripting" href="http://en.wikipedia.org/wiki/Cross-site_scripting" target="_blank">Cross Site Scripting</a>) and <a title="Escaping html" href="http://wiki.python.org/moin/EscapingHtml" target="_blank">escape</a> the rest of whatever the user types in:</p>
<pre class="brush: python; title: ; notranslate">
import html
import re

# Remove html tags and escape the input
def scrapeclean(text):
    # This matches a single opening or closing tag: everything from '&lt;' up to the next '&gt;'
    x = re.compile(r'&lt;[^&lt;]*?/?&gt;')
    # Replace the tags with nothing using sub, escape what's left over and return the result, all in one line!
    # (html.escape stands in for cgi.escape, which was removed in Python 3.8; it also escapes quotes)
    return html.escape(x.sub('', text))
</pre>
<p>For full disclosure: I took the compile statement from the following <a href="http://love-python.blogspot.com/2008/07/strip-html-tags-using-python.html" target="_blank">site</a> (I&#8217;m not a regex expert).</p>
<p>So you can call this function from somewhere in your <a href="http://www.python.org">python</a> code and the result will be &#8216;scraped clean&#8217; of all tags beginning with &lt; and ending with &gt;, plus any ampersands or other special characters will be &#8216;escaped&#8217;.</p>
<p>YMMV &#8211; this is very likely not a complete protection against all the things a hacker can input in your website, but it&#8217;s certainly a start.</p>
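<p>A small usage sketch of the same idea, assuming Python 3, where <code>html.escape</code> is the counterpart of <code>cgi.escape</code>. The sample inputs are made up. It shows what is and is not removed: tags go, the text between them stays, and a lone angle bracket is escaped rather than stripped.</p>

```python
import html
import re

# Same pattern as in the function above: one tag, from '<' to the next '>'.
TAG = re.compile(r'<[^<]*?/?>')


def scrapeclean(text):
    # Strip anything that looks like a tag, then escape what is left.
    return html.escape(TAG.sub('', text))


if __name__ == "__main__":
    print(scrapeclean('<b>Tom & Jerry</b>'))           # Tom &amp; Jerry
    print(scrapeclean('<script>alert(1)</script>hi'))  # alert(1)hi
    print(scrapeclean('a < b'))                        # a &lt; b
```

<p>Note that the contents of the script element survive as plain text; only the tags themselves are removed.</p>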
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2009/12/05/cleaning-up-user-input-variables-on-the-web-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Python Package Manager</title>
		<link>http://www.boschmans.net/2009/11/11/python-package-manager/</link>
		<comments>http://www.boschmans.net/2009/11/11/python-package-manager/#comments</comments>
		<pubDate>Wed, 11 Nov 2009 12:35:20 +0000</pubDate>
		<dc:creator>alex</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://www.boschmans.net/?p=817</guid>
		<description><![CDATA[The Python Package Manager is here, a visual tool for the python developer to find and install all the necessary packages. It shows you what is already installed on your system, with the option to uninstall the packages, and by &#8230; <a href="http://www.boschmans.net/2009/11/11/python-package-manager/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<div id="attachment_848" class="wp-caption alignnone" style="width: 135px"><a rel="attachment wp-att-848" href="http://www.boschmans.net/2009/11/11/python-package-manager/python-package-manager-logo/"><img class="size-full wp-image-848" title="Python-Package-Manager-Logo" src="http://www.boschmans.net/wp-content/uploads/2009/11/Python-Package-Manager-Logo.PNG" alt="Python Package Manager Logo" width="125" height="140" /></a><p class="wp-caption-text">Python Package Manager Logo</p></div>
<p>The <a title="Python Package Manager" href="http://sourceforge.net/projects/pythonpkgmgr/" target="_blank">Python Package Manager</a> is here, a visual tool for the python developer to find and install all the necessary packages.</p>
<p>It shows you what is already installed on your system, with the option to uninstall the packages, and by typing into the search box you can find additional packages and install them, all graphically. It&#8217;s supposed to be cross-platform, but the <a title="Python Package Manager homepage" href="http://www.preisshare.net/pythonpkgmgr/" target="_blank">homepage</a> of the developer only provides a Windows download option.</p>
<p>I just hope this gets used and keeps being supported, as it is a lot handier than using the command line ! I do think you still need easy_install and wxPython/wxWidgets though&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.boschmans.net/2009/11/11/python-package-manager/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

