the classpath. Sounds simple enough, and according to the docs all you need to do is set a system property called `log4j.configuration` (it is a JVM system property passed with `-D`, not an environment variable). The tricky part, which is easy to miss, is that the path needs to be given as a URI, so for example you may want to run your app like this:
```
java -Dlog4j.configuration=file:/some/path/log4j.xml -jar myJar.jar
```
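For reference, a minimal `log4j.xml` that such a command could point to might look like this (a sketch of a standard log4j 1.x configuration, shown here purely for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- A simple console appender with a pattern layout -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%d [%t] %-5p %c - %m%n"/>
    </layout>
  </appender>
  <root>
    <priority value="info"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
```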


Each line of an access log file looks like this:
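For illustration, here is the canonical example line from the Wikipedia article on the Common Log Format:

```
127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
```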

so the first thing we need to do is to read the data into our R session. This is a bit tricky, as the timestamp contains a whitespace separating it from the time zone:

```r
ReadLogFile <- function(file = log.file) {
  # http://en.wikipedia.org/wiki/Common_Log_Format
  access_log <- read.table(file,
    col.names = c("ip", "client", "user", "ts", "time_zone",
                  "request", "status", "response.size"))
  access_log$ts <- strptime(access_log$ts, format = "[%d/%b/%Y:%H:%M:%S")
  access_log$time_zone <- as.factor(sub("\\]", "", access_log$time_zone))
  access_log$status <- as.factor(sub("\\]", "", access_log$status))
  access_log
}

log.file <- "access_log"
access_log <- ReadLogFile(log.file)
```

Now let’s plot the server load:

```r
library(ggplot2)
library(colorspace)

ggplot(access_log, aes(x = ts)) +
  geom_density(stat = "bin", binwidth = 3600,
               colour = "black", fill = "darkgreen") +
  ylab("Requests/hour") + xlab("Time")
```

A clear user activity pattern emerges – a lot more users during working hours and very few at night.

A quick analysis of the response codes reveals that there were some 5xx errors:

```r
df <- as.data.frame(table(access_log$status))
colnames(df) <- c("status", "freq")

ggplot(df, aes(x = "", y = freq, fill = status)) +
  geom_bar(position = "fill", alpha = 0.8, width = 0.5, stat = "identity") +
  scale_fill_manual(values = rainbow_hcl(9, start = 200, end = 20)) +
  coord_flip() + xlab("") + ylab("Frequency")
```

Let’s take a closer look:

```r
server.errors <- grep("5..", access_log$status)

ggplot(access_log[server.errors, ], aes(x = status)) +
  geom_histogram(colour = "black", fill = "red")
```

And finally: how were the errors distributed over time?

```r
ggplot(access_log[server.errors, ], aes(x = ts)) +
  geom_density(stat = "bin", binwidth = 3600, colour = "black", fill = "red") +
  ylab("Errors/hour") + xlab("Time")
```


The game we’re dealing with is, to say the least, not very complicated, and even if it were more challenging, there are specialised methods just waiting to be used in such situations. So rolling out a neural approach here is like cracking a nut with a sledgehammer… that being said, let’s get down to business!

First of all, an artificial neural network is a biologically inspired computational model that basically tries to mimic the way a real, live brain works. When somebody approaches you in a dark corner and asks “Pssst! Wanna see a neural network?”, they will most probably show you something like this:


Such a network is actually a multilayer perceptron, with three layers of neurons (circles) connected by weighted links (arrows). It works by receiving some inputs through the input layer, processing them with the neurons, and returning the result through the output layer.
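To make this concrete, the forward pass of such a network can be sketched in a few lines of R (layer sizes and weight values here are arbitrary, picked purely for illustration):

```r
# Forward pass of a tiny multilayer perceptron:
# 2 inputs -> 2 hidden neurons (tanh) -> 1 linear output
inputs   <- c(0.5, -1)
w.hidden <- matrix(c(0.1, 0.4, -0.2, 0.3), nrow = 2)  # input -> hidden weights
w.output <- c(0.7, -0.5)                              # hidden -> output weights

hidden <- tanh(w.hidden %*% inputs)  # activations of the hidden layer
output <- sum(w.output * hidden)     # value returned by the output layer
output
```

Training will later amount to adjusting `w.hidden` and `w.output` so that the output is something useful.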

Initially such a network is pretty useless – until it’s trained. Training basically means figuring out how the weight values should be set. So how do we teach one to play tic-tac-toe?

Let’s start from the beginning: define an empty board of a given size and write a function (in R) which evaluates whether somebody has won:

```r
board.size <- 3

# The game board is a matrix with: 1 - tic, 0 - empty, -1 - tac
GenerateEmptyBoard <- function() {
  matrix(0, ncol = board.size, nrow = board.size)
}

DisplayBoard <- function(board) {
  b <- factor(board, levels = c(-1, 0, 1), labels = c("X", "*", "O"))
  dim(b) <- dim(board)
  b
}

FlipMatrix <- function(m) {
  apply(m, 2, rev)
}

# Checking whether somebody has won
EvaluateBoard <- function(board) {
  sums <- c(colSums(board), rowSums(board),
            sum(diag(board)), sum(diag(FlipMatrix(board))))
  if (max(sums) == board.size) { return(1) }
  if (min(sums) == -board.size) { return(-1) }
  0
}
```
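As a quick sanity check, `EvaluateBoard` can be exercised on a hand-made winning position (the relevant definitions are repeated here so the snippet runs on its own):

```r
board.size <- 3
FlipMatrix <- function(m) apply(m, 2, rev)
EvaluateBoard <- function(board) {
  sums <- c(colSums(board), rowSums(board),
            sum(diag(board)), sum(diag(FlipMatrix(board))))
  if (max(sums) == board.size) return(1)
  if (min(sums) == -board.size) return(-1)
  0
}

b <- matrix(0, nrow = board.size, ncol = board.size)
b[1, ] <- 1        # the tic player fills the whole first row
EvaluateBoard(b)   # returns 1: the tic player has won
```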

Now let’s create functions for our neural network to let it play the game. A reasonable approach is to teach the network to evaluate the board situation. In that case, when the computer makes its move, it will use the neural network to evaluate each possible move and choose the best one.

```r
neurons <- 1

# Creating an empty neural network which we represent as a matrix of weights
InitNN <- function() {
  n.weights <- neurons * board.size^2 + 2 * neurons
  matrix(runif(n.weights), nrow = neurons)
}

# Calculating the network
RunNN <- function(nn, board) {
  w.out <- nn[, 1]
  w0.in <- nn[, 2]
  w.in  <- nn[, 3:ncol(nn)]
  t(w.out) %*% tanh(w.in %*% as.vector(board) + w0.in)
}

# Evaluating every move possible using the network
RunAI <- function(ai, board, side) {
  res <- sapply(1:length(board), function(i) {
    b <- board
    b[[i]] <- side
    RunNN(ai, b)
  })
  # We don't want the AI to cheat
  res[board != 0] <- -Inf
  # We choose the best one
  which.max(res)
}

Move <- function(ai, board, side) {
  move <- RunAI(ai, board, side)
  board[[move]] <- side
  board
}

IsMovePossible <- function(board) {
  length(which(board == 0)) > 0
}
```

So now we have all we need to make a neural network play tic-tac-toe, but it won’t be very good at it – until we teach it. One way to do that is to sit in front of the computer for a couple of weeks, playing game after game against our pet network until it is worthy… or: behold the Random Player!

```r
NNvsRandomPlayer <- function(tic.ai) {
  side <- sample(c(-1, 1), 1)
  board <- GenerateEmptyBoard()
  eval <- 0
  while (eval == 0 && IsMovePossible(board)) {
    if (side == 1) {
      board <- Move(tic.ai, board, side)
    } else {
      # Make a valid move completely at random and see what happens
      # (indexing into `empty` avoids sample()'s surprising behaviour
      # when it is given a single number)
      empty <- which(board == 0)
      move <- empty[sample(length(empty), 1)]
      board[[move]] <- side
    }
    eval <- EvaluateBoard(board)
    side <- side * -1
  }
  eval
}
```

So now that we have a worthy opponent, let’s see how our network does in a series of 10,000 games:

Hmm… not very good, but let’s train it:

```r
TrainAI <- function() {
  # Function to evaluate how good our network is
  Eval <- function(w) {
    tic.ai <- matrix(w, nrow = neurons)
    # Playing against a random player is nondeterministic,
    # so we need to stabilise the results
    ev <- median(sapply(1:20, function(j)
      mean(sapply(1:20, function(i) NNvsRandomPlayer(tic.ai)))))
    -1 * ev
  }
  len <- length(InitNN())
  # This is a global optimisation problem, so we need an appropriate
  # method - a differential evolution algorithm seems sufficient
  res <- DEoptim::DEoptim(Eval, rep(-0.1, len), rep(0.1, len),
    DEoptim::DEoptim.control(trace = 1, parallelType = 0, NP = 10, VTR = -1.0))
  matrix(res$optim$bestmem, nrow = neurons)
}
```

After the training our impressive single-hidden-neuron network becomes better (again 10,000 games):

It works! On the other hand, you could probably implement a better “AI engine” for tic-tac-toe using a couple of `if`s, but that’s a whole different story…


of your calculations directly from R. This is a rather straightforward task when using commands like `latex` from the Hmisc package. One place where this gets a bit awkward is when you want to include some math in e.g. column names.

A simple trick that seems to work pretty well is to map your column names just before generating the latex table.

This can be done using the hash package like so:

```r
require(Hmisc)
require(hash)

# Some data for our table
x <- 1:5
df <- data.frame(x = x, x2 = x^2, x3 = x^3, x0.5 = sqrt(x))

# First version - without mapping
a <- latex(df, rowlabel = "", file = "table.tex")

# Second version - column names mapped using a hash object
map <- function(vals) {
  col.map <- hash(
    x    = "$x^1$",
    x2   = "$x^2$",
    x3   = "$x^3$",
    x0.5 = "$\\sqrt x$")
  lapply(vals, function(v) col.map[[v]])
}
colnames(df) <- map(colnames(df))
a <- latex(df, rowlabel = "", file = "table.mapped.tex")
```
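If you would rather avoid the extra dependency, the same mapping can be done with a plain named character vector (a sketch reusing the same example data):

```r
x  <- 1:5
df <- data.frame(x = x, x2 = x^2, x3 = x^3, x0.5 = sqrt(x))

# A named character vector works as a simple lookup table
col.map <- c(x    = "$x^1$",
             x2   = "$x^2$",
             x3   = "$x^3$",
             x0.5 = "$\\sqrt x$")

colnames(df) <- unname(col.map[colnames(df)])
colnames(df)
```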

After including the generated tex files in some LaTeX document, the table without the mapping looks like this

… and using mapped column names we have:


One way of adding math symbols to your plots is using the `expression` function. However, if you are planning to use your plots as part of an article or report written in LaTeX, then in my opinion it is actually better to use the tikz graphical device. This may be a bit more complicated at first, but with tikz you can use the same syntax both in the report text and on the plots, so in the long run it is worth the effort.

So here’s the code:

```r
require(tikzDevice)

# Start the tikz device
tikz('phi_plot.tex',
     standAlone = TRUE, # We want a tex file that can be directly compiled
     width = 6, height = 6,
     packages = c(options()$tikzLatexPackages, "\\usepackage{amsfonts}"))

# Prepare your wonderful plot
x <- seq(-2, 2, length = 100)
plot(x, x^2 + rnorm(100, sd = 0.1), asp = 1,
     xlab = "x", ylab = "$\\phi(x)$",
     main = "Plot of the $\\phi : \\mathbb R \\rightarrow \\mathbb R$ function defined as $\\phi(x) = x^2 + \\varepsilon$")

# Turn the device off
dev.off()

# Convert the latex file to a pdf
tools::texi2pdf('phi_plot.tex', quiet = FALSE)
```

If all goes well then `phi_plot.pdf` should look like this:

There are however some possible pitfalls.

- Notice how the amsfonts package is included – if we simply tried to do this:

```r
tikz('phi_plot.tex',
     standAlone = TRUE, # We want a tex file that can be directly compiled
     width = 6, height = 6,
     packages = c("\\usepackage{amsfonts}"))
```

then the result would most probably be:

```
! ==> Fatal error occurred, no output PDF file produced!
Error in getMetricsFromLatex(TeXMetrics) :
  TeX was unable to calculate metrics for the following string
  or character:

    m

  Common reasons for failure include:
    * The string contains a character which is special to LaTeX unless
      escaped properly, such as % or $.
    * The string makes use of LaTeX commands provided by a package and
      the tikzDevice was not told to load the package.
```

This is because the `packages` argument overrides the packages set in the tikz options, so the original option value (`options()$tikzLatexPackages`) also needs to be included.

- If something bad happens and the tikz device doesn’t get switched off, then the next attempt to convert the tex file to a pdf may end with a complaint that `texi2dvi` returned error code 1. In such a case, try to turn the tikz device off (`dev.off()`) and, just in case, put the entire clause in a `tryCatch` with the `dev.off()` in the finally clause.
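That last piece of advice can be sketched like this (using the base `pdf` device so the example is self-contained; with tikzDevice you would call `tikz(...)` in the same place):

```r
# Wrap the device handling in tryCatch so the device is always closed,
# even if the plotting code fails halfway through.
SafePlot <- function(path) {
  tryCatch({
    pdf(path)  # with tikzDevice: tikz(path, standAlone = TRUE, ...)
    x <- seq(-2, 2, length = 100)
    plot(x, x^2)
  }, finally = {
    # The finally clause runs no matter what happened above;
    # only close a device if one is actually open
    if (dev.cur() > 1) dev.off()
  })
}

SafePlot(file.path(tempdir(), "safe_plot.pdf"))
```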
