Distinguishing legitimate software from malicious software is a problem that requires a lot of expertise. In order to create a malware detection software, an approach consists in extracting System Call Dependency Graphs (SCDG) which summarize the behavior of a software. Once SCDGs are extracted, a learning phase identifies sub-graphs characteristic of malicious behaviors. In order to classify the graph of an unknown binary, we look whether it contains such sub-graphs. These techniques proved to be efficient, but no analysis of the sub-graphs extracted during the learning phase has been conducted so far. We study the sub-graphs we find and showcase preprocessing steps on the graphs in order to improve the learning and classification performance. The approach has been applied on graphs extracted from Mirai. Mirai is a malware which created a large botnet in 2016 to perform distributed deny of service attacks. We show that the preprocessing step tremendously improve the speed of the learning and classification.