Saturday 23 March 2013

Bad Python Programmer

All my Python stuff will now be on http://badpython.blogspot.co.uk/

Another Attempt At Programming


This is a project I started years ago while I was at college using Delphi Developer 2.0. I think I was doing fairly well, although I was having trouble with the sheer amount of data this project can generate.

So what is it? Well it's a program intended to narrow the odds of winning the lottery by eliminating combinations of numbers already drawn and using the historical data to predict the next draw.

The last time I attempted this I tried to do it all as one big program. This time I'm breaking it down into little bits. One program for each element that does one thing and does it well.

The first task is to generate all possible combinations of numbers, ordered sequentially. It might turn out I don't really need this part, but it gets me started. I've kept a diary of my progress so far, which is published in this post. I think I'll also start a blog specifically for this project.


Project: Lotto Predictor
Module Name: lotto-sng-sqldatabase.py, lotto-sng-txt.py
Author: Kevin Lynch
Created: 17.03.2013



Brief
Write a program that generates all sequential combinations of numbers in a given range.

  • The program must save the resulting output to an SQL database.


17.03.2013: Project Established

“lotto-sng-sqldatabase.py” is a program designed to generate every possible sequential number combination for a given Camelot lottery draw game. The games primarily being targeted are the Lotto, EuroMillions and Thunderball.

The draw lines generated will be saved to an SQL database for later analysis by a different module.

Modules being imported include:

  • os – helps with host OS functions.
  • sys – helps with host OS functions.
  • termios – helps with host OS terminal functions.
  • fcntl – helps with host OS terminal functions (I think … not entirely sure).
  • struct – helps with data structure functions.
  • string – helps with string manipulation functions.
  • apsw – SQLite wrapper for Python (third-party, not part of the standard library).

Each game type will be defined as a class in its own right based on a generic class. This should help to avoid duplication of code while at the same time allowing game-specific modifications.

  • class rootg(): – is the generic game object.
  • class lotto(): – is the Lotto game object.
  • class eurom(): – is the EuroMillions game object.
  • class thund(): – is the Thunderball game object.

All game-specific objects need to pass game-specific data to rootg objects: specifically, the number of main sequence numbers and special numbers (bonus ball, thunder ball, lucky stars), and the upper and lower limits on these number groups.
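
As a rough illustration of that class layout, here is a minimal sketch. The constructor arguments (`main_count`, `main_low`, `main_high`) and the two helper methods are assumptions made up for this example, and the special numbers are left out entirely. Only the Lotto figures (six main numbers, 1 to 49) come from this diary.

```python
class rootg(object):
    """Generic game: holds the limits for the main number group."""

    def __init__(self, main_count, main_low, main_high):
        self.main_count = main_count  # how many main numbers per line
        self.main_low = main_low      # lowest main number
        self.main_high = main_high    # highest main number

    def first_line(self):
        # Lowest possible line, e.g. [1, 2, 3, 4, 5, 6] for Lotto.
        return list(range(self.main_low, self.main_low + self.main_count))

    def last_line(self):
        # Highest possible line, e.g. [44, 45, 46, 47, 48, 49] for Lotto.
        return list(range(self.main_high - self.main_count + 1, self.main_high + 1))


class lotto(rootg):
    """Lotto: six main numbers from 1 to 49."""

    def __init__(self):
        rootg.__init__(self, main_count=6, main_low=1, main_high=49)


game = lotto()
print(game.first_line())  # [1, 2, 3, 4, 5, 6]
print(game.last_line())   # [44, 45, 46, 47, 48, 49]
```

The eurom and thund classes would follow the same pattern, each passing its own figures up to rootg.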

Phase 1 in developing this module will be to simply get the program to generate the number sequences and print them to the terminal window.

Phase 2 will be the development of the SQL database and directing output to the database.

Phase 3 will be a final polish and will not be considered essential.

17.03.2013: Basic Structure Established

The four main classes have been established as stubs along with the main “while” loop and a stub function for the main menu. Three other stubs were added: “do_lotto”, “do_eurom” and “do_thund”. These are intended for program flow control. However, they will likely be removed, as I think the class structure will probably provide enough flow control.

A final exit message was also added primarily as a test for the main “while” loop. Don't judge me, I'm paranoid!

17.03.2013: SNG Code Prototyped

SNG code has been prototyped as a function. The next step is to convert it to a class object and optimise it as a generic object so that the same code can be utilised for all targeted games.

It may also be necessary to periodically save generated sequences to file. The Python list object currently takes up 800MiB of RAM and counting with the third column at 31. Its maximum is 46.
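
The memory growth comes from holding every line in one list. A minimal sketch of one alternative, assuming each line can be written out as soon as it is produced: an iterator hands back one line at a time, so memory use stays flat. The function name `lines` is hypothetical, and the sketch leans on the standard library's `itertools.combinations` rather than the prototype's own loops.

```python
import itertools


def lines(low, high, count):
    """Return an iterator over every ascending combination, one tuple at a time."""
    # itertools.combinations emits tuples in sorted order when its input is sorted.
    return itertools.combinations(range(low, high + 1), count)


# Small demonstration range so the example finishes instantly: 3 numbers from 1..5.
for line in lines(1, 5, 3):
    print(",".join("%02d" % n for n in line))
```

Nothing is stored between iterations, so the same loop should cope with the full 6 from 49 range without the list ever existing in RAM.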

17.03.2013: Proper Code Name Being Considered

Toutatis, the Celtic god of protection, war and wealth is being considered as a code name for the project. It needs a proper code name.
 
18.03.2013: Duplication Checking Added

Tests added to check for duplicate lines and numbers. Currently suppresses output. May need to do this the long hard way.

18.03.2013: Duplication Checking Added Again

A simple way to remove duplicates has been found, but it has an odd effect on main ball 6. It prevents it from reaching its maximum of 49 and duplicates every line when main ball 6 has a value of 48.

18.03.2013: Reworked Number Line Generation And Duplication Removal

Method of generating number lines reworked with some basic duplication removal. Some anomalies still show up. It might be better to remove these when the final list has been produced, although this list will be over 2GiB in size.

I'm now using a single “while” loop where I was using nested “for” loops before. The code is much cleaner, more compact and should be adaptable to a generic game object. The reworked code is also exceptionally fast, producing hundreds of thousands of combinations in seconds.

Two small utility functions were also created: “inc(v,line)” and “dec(v)”. “inc(v,line)” increments “v” and checks for its existence in “line”. It continues until it finds a value for “v” not already in “line”.

“dec(v)” subtracts 1 from “v”.
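
Going by that description, the two helpers might look something like the sketch below. This is a reconstruction from the prose, not the code as it stood on the day, and it has no upper limit check.

```python
def inc(v, line):
    """Return the next value above v that is not already in line."""
    v = v + 1
    while v in line:
        v = v + 1
    return v


def dec(v):
    """Return v minus 1."""
    return v - 1


print(inc(3, [1, 2, 3, 4, 5, 9]))  # 4 and 5 are taken, so this prints 6
print(dec(7))                      # prints 6
```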

20.03.2013: While Does Not Stop

The “while” loop used to generate number combinations does not stop at the predicted final combination, meaning the stop condition test has failed. This will need to be reworked again.

Upper limit enforcement will also need to be implemented. Ball 6 is somehow incrementing to 50.

20.03.2013: The Bonus Ball Is The Root Of All Evil

It turns out generating the main line numbers and the bonus ball in one step is the root of all the problems so far. Removing the bonus ball from the equation allows the stop test condition to kick in and stop the “while” loop.

There's really no particular need to generate the bonus ball. So I will be ditching that for now.

Using “if v not in line:” is still allowing the odd anomalous result. I think I will be ditching “not in” in favour of writing my own function.

20.03.2013: Duplication Issues Resolved!

Resolving duplication of individual values and entire lines has been an issue throughout the project. I'm now confident this problem has been resolved. I have done this by writing my own function to replace “not in” and by ensuring the value in the column to the right is always greater than the value in the column to the left.

I believe this will prevent duplicate lines and values while still generating the full range of unique combinations that would be valid in the UK National Lottery.
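
The reason the left-to-right rule works is that any set of distinct numbers has exactly one ascending arrangement, so the rule throws away both repeated values and re-ordered copies of the same line. A small sketch on a toy range (3 numbers from 1 to 4), with the hypothetical helper name `is_ascending`:

```python
import itertools


def is_ascending(line):
    """True when every value is greater than the one to its left."""
    return all(left < right for left, right in zip(line, line[1:]))


# Every ordering of 3 distinct numbers from 1..4 (24 in total).
orderings = list(itertools.permutations(range(1, 5), 3))

# Keep only the lines whose columns strictly increase left to right.
kept = [line for line in orderings if is_ascending(line)]

print(len(orderings))  # 24
print(len(kept))       # 4
print(kept)            # [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
```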

20.03.2013: Change Of Plan

The original plan was to have one program that could do everything. I now think it would be better to create smaller utility programs to do the one-off tasks. The main program can always call these utility programs if need be. This should simplify development.

The program has been cleaned up to reflect this change of plans. Using Python print formatting and the tee utility at the terminal, output can now be saved to a comma delimited file. However the final utility will still likely build an SQL database.

20.03.2013: Return Of The Failure To Stop Bug

The program has suddenly started failing to stop again. I can't figure out why. Balls 2, 3, 4, 5 and 6 burst their limits by 1.

20.03.2013: Stop Bug Resolved … Again!

Problem resolved again by adding extra checking to “inc”.

21.03.2013: First Perfect Set Of Results

I've just checked the results this morning and they appear to be perfect. The program stopped when it should have and there don't seem to be any anomalous results. The final file is 152 MiB in size. This is a little worrying as I'm fairly certain I used to get a file gigabytes in size with Delphi. However, at first glance everything seems to be present and correct.
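
The file size worry can be checked with a little arithmetic. The sketch below counts the ways of choosing 6 numbers from 49 and estimates the file size, assuming every line is six two-digit numbers separated by commas (17 characters) plus a one-byte newline.

```python
import math

# Number of ways to choose 6 numbers from 49, ignoring order.
total_lines = math.factorial(49) // (math.factorial(6) * math.factorial(43))

# Assumption: six two-digit numbers, five commas and one newline per line.
bytes_per_line = 6 * 2 + 5 + 1

total_bytes = total_lines * bytes_per_line

print(total_lines)             # 13983816
print(total_bytes)             # 251708688
print(total_bytes // 1048576)  # 240 (whole MiB)
```

Running `wc -l` on the output file gives the actual line count to compare against. A file well under the estimate would be worth a second look, since it would suggest some combinations are missing.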

The next step is to write a program that can load two text files and compare the contents. When a matching line is found it will be marked “VOID”.
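
A minimal sketch of that comparison step, assuming both files use the same comma-delimited line format and that a match is marked by appending “,VOID” to the line. The function name `mark_void` and the marking format are guesses for illustration only.

```python
def mark_void(all_lines, drawn_lines):
    """Return all_lines with any line that also appears in drawn_lines marked VOID."""
    # A set makes each membership test quick, however long the history is.
    drawn = set(line.strip() for line in drawn_lines)
    result = []
    for line in all_lines:
        line = line.strip()
        if line in drawn:
            result.append(line + ",VOID")
        else:
            result.append(line)
    return result


# Two tiny in-memory "files"; real use would pass open file objects instead.
generated = ["01,02,03,04,05,06\n", "01,02,03,04,05,07\n", "01,02,03,04,05,08\n"]
history = ["01,02,03,04,05,07\n"]

for line in mark_void(generated, history):
    print(line)
```

Only the smaller file of past draws is loaded into memory as a set. The big file of generated lines can be read one line at a time.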

22.03.2013: Working With Text Files

I've decided to stick to working with text files for the time being. Writing small programs that do one thing and one thing only seems to be the best way forward at the moment until I get a better grasp of programming with Python. So with that in mind “lotto-sng-sqldatabase.py” will become “lotto-sng-txt.py”.

This program does not generate the final text file, but rather generates text output to the terminal. The output must be redirected to a text file using piping or bifurcation, which is supported by virtually every Linux distribution. Mac and Windows terminals also support output redirection, not that these platforms will be tested. They're not a priority.

22.03.2013: “lotto-sng-txt.py” Usage

Simple output to terminal command:

./lotto-sng-txt.py

Redirected output command:

./lotto-sng-txt.py > whatever-file-name-you-like

Bifurcated output command:

./lotto-sng-txt.py | tee whatever-file-name-you-like


22.03.2013: Final Code Listing For “lotto-sng-txt.py”

 
#!/usr/bin/env python
# Project : Lotto Predictor
# Module Name : lotto-sng-txt.py
# Author : Kevin Lynch
# Created : 17.03.2013
# Brief: Write a program that generates all sequential combinations of numbers in a given range.

# Function Definitions:
def inc(v,line,m):
    # Step v upwards until it is not already in line, never going past m.
    go = True
    while go == True:
        if v < m:
            v = v + 1
            if v != line[1]:
                if v != line[2]:
                    if v != line[3]:
                        if v != line[4]:
                            if v != line[5]:
                                if v != line[6]:
                                    go = False
        else:
            return v
    return v

def do_lotto():
    # god[0] is the run flag; god[1] to god[6] hold [lower, upper] limits per ball.
    god = [True,[1,44],[2,45],[3,46],[4,47],[5,48],[6,49]]
    # vline[0] is a placeholder so that ball n lives at index n.
    vline = ["*",1,2,3,4,5,6]
    # print is parenthesised so the line runs under both Python 2 and Python 3.
    print("%02d,%02d,%02d,%02d,%02d,%02d" % (vline[1],vline[2],vline[3],vline[4],vline[5],vline[6]))
    while god[0] == True:
        if vline != ["*",44,45,46,47,48,49]:
            # Move the right-most ball that is still below its upper limit.
            if vline[6] < god[6][1]:
                vline[6] = inc(vline[6],vline,god[6][1])
            elif vline[5] < god[5][1]:
                vline[5] = inc(vline[5],vline,god[5][1])
                vline[6] = inc(vline[5],vline,god[6][1])
            elif vline[4] < god[4][1]:
                vline[4] = inc(vline[4],vline,god[4][1])
                vline[5] = inc(vline[4],vline,god[5][1])
            elif vline[3] < god[3][1]:
                vline[3] = inc(vline[3],vline,god[3][1])
                vline[4] = inc(vline[3],vline,god[4][1])
            elif vline[2] < god[2][1]:
                vline[2] = inc(vline[2],vline,god[2][1])
                vline[3] = inc(vline[2],vline,god[3][1])
            elif vline[1] < god[1][1]:
                vline[1] = inc(vline[1],vline,god[1][1])
                vline[2] = inc(vline[1],vline,god[2][1])
            print("%02d,%02d,%02d,%02d,%02d,%02d" % (vline[1],vline[2],vline[3],vline[4],vline[5],vline[6]))
        else:
            # The final line has been printed, so stop the loop.
            god[0] = False

# Main Program:
do_lotto()


So there it is. The first component is way simpler than I first imagined and planned it to be. But with the tee command available to create the text file for me I see no reason to duplicate this function at this early stage. Now I just need to do the elimination part and the prediction part. Which will be really easy. ... :(
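
A carry-style loop like this is easy to cross-check on a small range before trusting it with 6 from 49. The sketch below is not the listing above: it is a separate, hypothetical pair of functions (`next_line`, `all_lines`) that resets every column to the right whenever a column further left moves up, and then compares the result with the standard library's `itertools.combinations`.

```python
import itertools


def next_line(line, high):
    """Return the ascending line that follows `line`, or None after the last one."""
    line = list(line)
    count = len(line)
    # Find the rightmost column that is still below its own maximum.
    col = count - 1
    while col >= 0 and line[col] == high - (count - 1 - col):
        col -= 1
    if col < 0:
        return None
    line[col] += 1
    # Reset every column to the right to the smallest value still allowed.
    for right in range(col + 1, count):
        line[right] = line[right - 1] + 1
    return line


def all_lines(low, high, count):
    """Yield every ascending line from the lowest to the highest."""
    line = list(range(low, low + count))
    while line is not None:
        yield tuple(line)
        line = next_line(line, high)


# Compare against itertools on a small range: 3 numbers from 1..6.
mine = list(all_lines(1, 6, 3))
reference = list(itertools.combinations(range(1, 7), 3))
print(len(mine))          # 20
print(mine == reference)  # True
```

Feeding any generator's output for the same small range through the same comparison should show quickly whether lines are missing or repeated.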

Note to Blogger developers. Is there any real reason why Blogger can't retain tabbed indentation?