哈夫曼树的大部分内容都在上一篇博客中讲解了,在这里重点说一下如何统计的文本数据。
基本思想:首先从文件中读取数据,按行读取,保存在String类型的遍历中。
在统计信息的时候,使用的数据结构是HashMap,字符作为key,个数作为value。
为了能与我之前写的程序相结合,需要将HashMap的数据转换成Huffman结点类型数据。然后按照之前的程序就可以操作了!

读取文本:

在public String getTextFromTxt (String filePath) {
		String text = "";
		try {
			FileReader fileReader = new FileReader(filePath);
			BufferedReader bufferedReader = new BufferedReader(fileReader);
			
			String str = "";
			while ((str = bufferedReader.readLine()) != null) {
				text += str;
			}
		} catch (FileNotFoundException e) {
			System.out.println("Not found this file!");
		} catch (IOException e) {
			System.out.println("I/O Error!");
		}
		return text;
	}

根据文本内容得到统计信息:

public HashMap<Character, Integer> text2HashMap (String text) {
		HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
		for (int i = 0; i < text.length(); i++) {
			if (hashMap.containsKey(text.charAt(i))) {
				hashMap.put(text.charAt(i), hashMap.get(text.charAt(i))+1);
			} else {
				hashMap.put(text.charAt(i), 1);
			}
		}
		return hashMap;
	}

将HashMap中的数据转换成Huffman结点类型:

public HuffmanNode[] hashMap2HuffmanNode (HashMap<Character, Integer> hashMap) {
		// 遍历hashmap,并构造nodelist
		HuffmanNode[] nodeList = new HuffmanNode[hashMap.size()*2-1];
		
		int i = 0;
		for(Map.Entry<Character, Integer> mm : hashMap.entrySet()) {
			nodeList[i++] = new HuffmanNode(mm.getValue(), mm.getKey());
		}
		
		// 初始化从i到最后的nodelist
		for (int j = i; j < nodeList.length; j++) {
			nodeList[j] = new HuffmanNode(-1, '#');
		}
		
		return nodeList;
	}

运行实例及运行结果
文本内容:

Beauty. It' s a word that can build people up to the highest levels of confidence and drag them down just as quickly. 
Standards of beauty have been noted for years, but the concept of it is so subjective that I find it hard to believe that
 so many industries market themselves on those so called standards. As you travel the world, these standards actually change. 
What might be considered the epitome of beauty in one country may be considered average or even ugly in another. And these 
standards also change over time. Decades ago, the classic definition of "beautiful" was more curvaceous. This trended into a waifish 
look. We have transitioned from that to what is considered "healthy" which places an emphasis on slim but muscular physique. We 
all have our own opinions and we' re going to travel the globe and see just how diverse beauty really is. ITALY Italian women have long been 
considered beautiful. But the standards are more or less based on taking care of themselves and maintaining a natural beauty. It’s about what 
makes them feel good. They take care of themselves and maintain their skin and hair. Although they enjoy the sun, they do it responsibly and use 
good sunscreen. Proper diet also helps maintain gorgeous skin and hair. Italians embrace life and enjoy all aspects of it. They believe in embracing 
what comes. They are fit and comfortable with themselves. Italy shows that beauty is more than skin deep and truly embraces the diversity that is out there.
FRANCE France is known for being fashion forward, but when it comes to the standard of beauty, it’s all about natural beauty. French women tend toward 
the clean, natural appearance instead of makeup. They actually prefer the bare, unmade up face that shows what they actually look like. They celebrate the 
true beauty and embrace what they were born with. French women enjoy the quirky and distinct features without the use of contouring and makeup. They 
go for a more understated look compared to Americans. French women are known to be glamorous and graceful at any age. They maintain a flawless 
complexion throughout their lives. KOREA Korean women are admired the world over for their incredible skin. If you think of the standard for gorgeous 
complexions, Korea ranks at the top. These women work very hard to maintain flawless, pale, porcelain skin. Their creamy, alabaster faces are used to sell 
skin care all around the globe. This standard of beauty originates from a time when a pale complexion was a sign of wealth and social standing. Tan skin 
was a sign of someone who had to work outside and couldn' t afford the luxury of skin care. They also place a high importance on more rounded eyes and 
will even go through surgery to achieve what is considered ideal. GREAT BRITAIN The British have a flair for the understated elegance. This shows in practically
 every aspect of their culture, including beauty. British women tend to prefer a more aristocratic look of slenderness, pale skin and maybe freckles. They tend to 
prefer comfortable and simple clothes that are more understated as well as a minimal look with makeup. British women also don' t worry as much about getting 
older. The wrinkles and signs of age that other women struggle to hide don' t bother them. They view these changes as signs of maturity and wisdom. Britain certainly 
has a healthier way to view standards of beauty at any age. BRAZIL Brazil is a land of exquisite beauty and it' s women live up to it. Unlike a lot of countries that 
place an emphasis on slenderness, the Brazilians have a different idea in mind. They embraced the look of a more curvaceous body for years. Women were admired 
for having muscular legs, wider hips and a large buttocks. Unfortunately, like so many other countries, the Western standards of beauty have started seeping into the 
country and causing a shift toward thinner bodies with larger breasts. Hopefully, this unbelievably diverse and gorgeous country will start to enjoy it’s own unique styles
 and looks. INDIA Clothing is a huge part of the Indian beauty culture. Women use henna, saris, and bindis to show femininity. These colourful adornments are a huge part 
of celebrations such as weddings. For many years, a thin body was not considered to be beautiful or healthy. Standards were placed on a healthier body. But Western ideals 
have started to show in India, bringing a slimming down. Indian women are known for their unbelievably gorgeous hair. Indian women spend a great deal of money on hair
 products and treatments to maintain the thick, glossy manes they are known for all over the world. AUSTRALIA Due to the endless sun that Australians are blessed with, 
many of the women have a sun kissed look. They do practice a great deal of skin care due to the extreme levels of solar radiation that the country deals with. The heat 
that Australia is known for tends to bring about a healthy, outdoors look that shows a fit and healthy body. This is hardly surprising from a country that is known for 
outstanding beaches, diving and other water sports. IRAN For a culture that still embraces having women staying covered up, Iran is the country that leads the world in 
rhinoplasty. It' s considered a sign of affluence and is almost expected. Women wear the bandages that cover their sculpted noses long after they are no longer necessary 
and wear them proudly. There are even stories of people buying the bandages and wearing them even though they didn't get the surgery.
With women spending so much time veiled, they embrace the use of surgery on one of the few body parts that they can show. 
But despite the strict laws in Iran, many women wear bright colours and have dyed hair underneath their veils as they try to balance between 
fashion and cultural beliefs. ETHIOPIA Ethiopian women have a long standing practice that is linked to beauty and marriage. Girls in the Karo tribe receive scars 
that are cut into their stomachs by the elders of the tribe. These are considered marks of beauty and desirability. Once the scars are placed, the girl is allowed to marry. 
The girls accept this practice and feel it will attract a good husband for them so they can get married and start a family.
Many cultures have scarring as part of their history and place a great deal of emphasis on it. What may be considered extreme by some will be considered the epitome 
of beauty by others. CHINA Due to changes in climate and political upheaval, a common trait that marked the women of Tibet in China is starting to become much more 
rare. Tibetan women have long borne the pink cheeks of living high in the mountains and being exposed to the weather conditions there.
But, due to the prominence of Chinese influence in the media, Tibetan women are being held to the standards that have swept over china. These incredible women who 
simply showed the naturally occurring colour from the lives they live are now being expected to maintain the pale flawless skin that has become the ideal for Asian women.
As you can see, beauty is truly subjective and diverse. Each country has their own take on what is considered ideal. But, if you can approach each place with an open mind, 
you can enjoy the ideals of each and every place you visit. Enjoy the beauty in the environment and the people that live in it.

测试代码:

public static void main(String[] args) {
		HuffmanTree ht = new HuffmanTree();
//      路径根据自己的文件路径填写
		String filePath = "./src/English.txt";
		String text = ht.getTextFromTxt(filePath);
//		System.out.println(text);
		HashMap<Character, Integer> hashMap = ht.text2HashMap(text);
		HuffmanNode[] nodeList = ht.hashMap2HuffmanNode(hashMap);
		ht.root = ht.buildHuffmanTree(nodeList, hashMap.size());
		ht.levelTraversal();	
		
//		 利用哈夫曼树进行编码
		String test = "Enjoy the beauty in the environment and";
		String Encode = ht.getEncodeofString(test);
		System.out.println("\n" + Encode);
//		 利用哈夫曼树进行解码
		System.out.println(ht.getDecodeofCode(Encode));
}

运行结果:

7295 2969 4326 1421 1548 1903 2423 703 718 726 822 885 1018 1159 1264 350 353 354 364 408 414 419 466 498 520 540 619 171 179 204 215 230 236 241 279 303 316 99 105 106 109 110 120 138 141 148 168 46 53 54 66 70 78 82 86 21 25 26 28 35 35 37 41 12 14 14 14 16 19 20 21 6 6 7 7 7 7 8 8 9 10 10 10 10 11 3 3 3 3 3 4 4 4 4 5 5 5 5 6 2 2 2 3 3 3 1 2 1 1 
1001010110011111011011100110101110111101011010010111100001010110000001101010111011100110111111101011010010111010011110010110011000101100111101111010011110101111100011110110
Enjoy the beauty in the environment and

由于该Huffman树实在太大,这里就不给出图片了!

最后给出完整的项目代码:

public class HuffmanNode {
	int weight;
	char c;
	HuffmanNode left;
	HuffmanNode right;
//	HuffmanNode parent;
	boolean isMin = false;
	
	// 构造函数
	public HuffmanNode (int weight, char c) {
		this.weight = weight;
		this.c = c;
		this.left = null;
		this.right = null;
//		this.parent = null;
		this.isMin = false;
	}
}
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
import java.util.Queue;
import java.util.Stack;

public class HuffmanTree {
	HuffmanNode root;
	
	/*
	 * 构造函数
	 */
	public HuffmanTree () {
		root = null;
	}
	
	/*
	 * 从文件中读取文本并保存
	 */
	public String getTextFromTxt (String filePath) {
		String text = "";
		try {
			FileReader fileReader = new FileReader(filePath);
			BufferedReader bufferedReader = new BufferedReader(fileReader);
			
			String str = "";
			while ((str = bufferedReader.readLine()) != null) {
				text += str;
			}
		} catch (FileNotFoundException e) {
			System.out.println("Not found this file!");
		} catch (IOException e) {
			System.out.println("I/O Error!");
		}
		return text;
	}
	
	/*
	 * 得到文本的统计信息,存储在HashMap数据结构中
	 */
	public HashMap<Character, Integer> text2HashMap (String text) {
		HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
		for (int i = 0; i < text.length(); i++) {
			if (hashMap.containsKey(text.charAt(i))) {
				hashMap.put(text.charAt(i), hashMap.get(text.charAt(i))+1);
			} else {
				hashMap.put(text.charAt(i), 1);
			}
		}
		return hashMap;
	}
	
	/*
	 * 将hashmap转换成Huffman结点
	 */
	public HuffmanNode[] hashMap2HuffmanNode (HashMap<Character, Integer> hashMap) {
		// 遍历hashmap,并构造nodelist
		HuffmanNode[] nodeList = new HuffmanNode[hashMap.size()*2-1];
		
		int i = 0;
		for(Map.Entry<Character, Integer> mm : hashMap.entrySet()) {
			nodeList[i++] = new HuffmanNode(mm.getValue(), mm.getKey());
		}
		
		// 初始化从i到最后的nodelist
		for (int j = i; j < nodeList.length; j++) {
			nodeList[j] = new HuffmanNode(-1, '#');
		}
		
		return nodeList;
	}
	
	/*
	 * 根据Huffman结点构造HuffmanTree
	 */
	public HuffmanNode buildHuffmanTree (HuffmanNode[] nodeList, int length) {
		int len = nodeList.length - length;
		HuffmanNode parentNode = null;
		for (int i = 0; i <= len-1; i++) {
			int j;
			HuffmanNode min1 = new HuffmanNode(7500, '#');
			HuffmanNode min2 = new HuffmanNode(7500, '#');
			
			// 得到权重最小的两个Huffman结点
			int isMin1Flag = -1;
			int isMin2Flag = -1;
			for (int k = 0; k < length+i; k++) {
				if (!nodeList[k].isMin) {
					if ((nodeList[k].weight < min1.weight)) {
						min2 = min1;
						isMin2Flag = isMin1Flag;
						min1 = nodeList[k];
						isMin1Flag = k;
					} else if ((nodeList[k].weight <= min2.weight)) {
						min2 = nodeList[k];
						isMin2Flag = k;
					}
				}
			}
			
			nodeList[isMin1Flag].isMin = true;
			nodeList[isMin2Flag].isMin = true;
			
			// 创建新的结点并添加到Huffman数组的末尾
			parentNode = new HuffmanNode(min1.weight + min2.weight, '#');
			
			//min1.parent = parentNode;
			//min2.parent = parentNode;
			
			parentNode.left = min1;
			parentNode.right = min2;
			
			nodeList[length+i] = parentNode;
		}
		return parentNode;
	}
	
	/*
	 * 得到对应字符的编码
	 */
	public String getEncodeofString (String test) {
		String Encode = "";
		for (int n = 0; n < test.length(); n++) {
			Encode += getEncodeofCharacter(test.charAt(n));
		}
		return Encode;
	}
	
	public String getEncodeofCharacter (char c) {
		Stack<String> stack1 = new Stack<String>();
		Stack<HuffmanNode> stack2 = new Stack<HuffmanNode>();
		String code = "";
		// 使用类似于后序遍历哈夫曼树
		if (root == null) {
			return "";
		}
		
		HuffmanNode current = root;
		HuffmanNode preNode = null;
		
		while (current != null) {
			stack2.add(current);
			stack1.add("0");
			current = current.left;
		}
		
		// 多入栈了一个0,弹出栈
		stack1.pop();
		
		// 开始判断哈夫曼树结点中的字符是否是我们要查找的
		while (!stack2.isEmpty()) {
			current = stack2.pop();
			
			if (current.right == null || current.right == preNode) {
				preNode = current;
				if (current.c == c) {
					break;
				} else {
					stack1.pop();
				}
			} else {
				stack2.add(current);
				
				current = current.right;
				stack1.add("1");
				stack2.add(current);
				
				while (current.left != null) {
					stack2.add(current.left);
					stack1.add("0");
					current = current.left;
				}
			}
		}
		
		// 退出while循环
		while (!stack1.isEmpty()) {
			code = stack1.pop() + code;
		}
		
		return code;
	}
	
	/*
	 * 根据编码得到对应的字符串
	 */
	public String getDecodeofCode (String encode) {
		String decode = "";
		int len = encode.length();
		int i = 0;
		
		HuffmanNode current = root;
		
		while (i < len) {
			if (encode.charAt(i) == '1') {
				current = current.right;
			} else {
				current = current.left;
			}
			
			if (current.left == null && current.right == null) {
				decode += current.c + "";
				current = root;
			} 
			
			i++;
		}
		return decode;
	}
	
	/*
	 * 使用层次遍历输出Huffman树
	 */
	public void levelTraversal () {
		if (root == null) {
			return;
		}
		
		Queue<HuffmanNode> queue = new LinkedList<HuffmanNode>();
		
		HuffmanNode current = root;
		
		queue.offer(current);
		while (!queue.isEmpty()) {
			current = queue.poll();
			System.out.print(current.weight + " ");
			
			if (current.left != null) {
				queue.offer(current.left);
			}
			
			if (current.right != null) {
				queue.offer(current.right);
			}
		}
	}
	
	/*
	 * 主函数测试Huffman树
	 */
	public static void main(String[] args) {
		HuffmanTree ht = new HuffmanTree();
		String filePath = "./src/English.txt";
		String text = ht.getTextFromTxt(filePath);
		HashMap<Character, Integer> hashMap = ht.text2HashMap(text);
		HuffmanNode[] nodeList = ht.hashMap2HuffmanNode(hashMap);
		ht.root = ht.buildHuffmanTree(nodeList, hashMap.size());
		ht.levelTraversal();	
		
//		 利用哈夫曼树进行编码
		String test = "Enjoy the beauty in the environment and";
		String Encode = ht.getEncodeofString(test);
		System.out.println("\n" + Encode);
//		 利用哈夫曼树进行解码
		System.out.println(ht.getDecodeofCode(Encode));
	}
}

更多推荐

哈夫曼树实现:统计文本信息,构造哈夫曼树,并对其进行编码与解码