Thursday, June 30, 2011

TopCoder SRM 165 Div 2 500-pt

We are trying to see which number of processors requires the least amount of time to process K tasks. We are told it takes 1 ms for 1 processor to process 1 task. So it takes N processors K / N ms to process K tasks. But parallelization requires time to sync the processors. We are told it takes OVERHEAD ms per pair of processors to sync up initially. How many pairs are there? There are n choose 2 pairs, which is n!/(2!(n-2)!). So we can calculate the total time for N processors as (n choose 2)*overhead + K / N. Ok, so this is easy, we just loop through and plug in a different N every iteration, and then return the best one. But don't forget, n! can become a very big number, so you have to protect against overflow. There's two ways: 1) use a bigger type, like long or BigInteger 2) Notice that with n!/2!(n-2)! you can cancel out alot of the top numbers because they appear on the bottom. I did 1. Another thing to worry about is the fact that computing n! is a recursion that computes the value of n*(n-1)... The thing to notice about this is that n! is actually n * (n - 1)!. Well since your main loop is going from 2 to n, you've already computed the value of (n - 1)! in the last loop. All you need to do is store the result in a hash table and look it up instead of recomputing everytime. Instead of doing N deep recursion through every loop, it's reduced to looking up the previous value in a hash table, or O(1). Another optimization is the fact that the function has a minimum point (the shortest time). So once you find that you can return. In fact, adding that last optimization my execution time from ~50 ms to ~5 ms.

import java.util.HashMap;
import java.math.BigInteger;
public class ParallelSpeedup
    HashMap<Integer, BigInteger> facts = new HashMap<Integer, BigInteger>();
    public int numProcessors(int k, int overhead)
        //(n!/(2!(n-2)! x overhead + k/n is the function
        BigInteger bestTime = BigInteger.valueOf(k);
        BigInteger prevTime = BigInteger.valueOf(k);
        int bestN = 1;
        int diff = 0;
        for(int n = 2; n < 1000; n++)
            BigInteger newTime = getTime(n, k, overhead);
            //check if we passed the minimum
            if(newTime.compareTo(prevTime) > 0)
                return bestN;
            diff = newTime.compareTo(bestTime);
            if(diff < 0){
                bestTime = newTime;
                bestN = n;
            prevTime = newTime;
        return bestN;
    private BigInteger getTime(int n, int k, int overhead)
        BigInteger nchoose2 = fact(n).divide(fact(2).multiply(fact(n-2)));
        BigInteger perPro = BigInteger.valueOf((int)Math.ceil((double)k / (double)n));
        return nchoose2.multiply(BigInteger.valueOf(overhead)).add(perPro);
    private BigInteger fact(int n)
        if(n <=1)
            return BigInteger.valueOf(1);
        else if (facts.containsKey(n))
            return facts.get(n);
            BigInteger n1 = BigInteger.valueOf(n).multiply(fact(n-1));
            facts.put(n, n1);
            return n1;

Sunday, June 26, 2011

Multi-threaded Quicksort using Java

NOTE: There was a bug in the initial code. The insertion sort portion had for(int i = left; i < right; i++), but right is already an index, not a size, so we need to change < to <=.

 I'm in the process of learning parallel programming. The first thing I wanted to do to learn it is to actually use threading in Java. So I created a multi-threaded version of quicksort. I compared it to single-threaded quicksort. Running it with 10 million elements i got 1300 ms for multi-threaded, and 1991 ms for single-threaded running on a quad-core processor.

Notice that insertion sort is done if the size of the array is less than 20. Two reasons:
1. Insertion sort is better on small amounts of data compared to quick sort
2. This reduces the number of threads spawned.

Here's the code...

//Use it by creating the QuickSorter object and calling start().
    public static void main(String[] args) {
        ArrayList<Integer> arr = new ArrayList<Integer>();
        for(int i = 10000000; i > 0; i--)
        ThreadGroup group = new ThreadGroup("QuickSorter");
        Thread t = new QuickSorter(arr, 0, arr.size() - 1, group);

public class QuickSorter extends Thread{
    private ArrayList<Integer> arr;
    int left;
    int right;
    public QuickSorter(ArrayList<Integer> arr, int left, int right, ThreadGroup g)
        super(g, "QuickSorter");
        this.arr = arr;
        this.left = left;
        this.right = right;
    public void run()
        if(left >= right || left < 0 || right >= arr.size())
        if(right - left < 20)
            for(int i = left; i <= right; i++)
                int tmp = arr.get(i);
                int j = i;
                for(; j > 0 && tmp < arr.get(j - 1); j--)
                    arr.set(j, arr.get(j - 1));

                arr.set(j, tmp);//[j] = tmp;
        int pivotIdx = (right + left) / 2;
        int pivot = arr.get(pivotIdx);
        int i = left;
        int j = right;
        while(i < j)
           while(arr.get(i) < pivot)
           while(arr.get(j) > pivot)
           if(i >= j)
           int tmp = arr.get(i);
           arr.set(i, arr.get(j));
           arr.set(j, tmp);
        QuickSorter leftQS = new QuickSorter(arr, left, i - 1, this.getThreadGroup());
        QuickSorter rightQS = new QuickSorter(arr, i + 1, right, this.getThreadGroup());

Tuesday, June 21, 2011

How to strip out the time of a DateTime and set it to what you want in SQL Server

Let's say I have two parameters @StartDate and @EndDate. I want to get all records with modifications between these DateTimes, but only the date part matters. If I don't know for sure what the Time part of the parameters are then i'm going to get bad results. So the best solution is to strip out the current time and add the time i want. In this case i want @StartDate's time to be 00:00:00 so it includes all records in that day, and @EndDate's time to be 23:59:59 so it includes all possible values in that day.

So here's how do that
SET @StartDate = DATEADD(dd, DATEDIFF(dd, 0, @StartDate),0)
SET @EndDate = DATEADD(dd, DATEDIFF(dd, 0, @EndDate),0) + '23:59:59'

The alternatives to this are not ideal. For example, you could use DATEPART and compare the year, month, and day of the record to the parameter. This means for every record you're executing DATEPART 3 times (i think it would cache the dateparting of the parameter, since it's on the right side, but I'm not entirely sure). That's going to be really slow compared to the solution above.

how it works
Start from the inside out. DATEDIFF(dd, 0, @StartDate) returns the total number of days since the minimum date ("0") in the system. So let's call this DAYS. The next part is DATEADD(dd, DAYS, 0). This is adding the number of days to the minimum date, thus stripping out the time.
There was an error in this gadget