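When studying the Hadoop Common module, it makes sense to start with the simplest, most basic part, so I picked the conf configuration module. Its overall class structure is very simple.
A class that implements the Configurable interface is generally configurable and can perform the corresponding configuration operations, but the configuration logic itself is concentrated in the Configuration class. This class defines quite a few collection fields: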
    /**
     * List of configuration resources.
     */
    private ArrayList<Object> resources = new ArrayList<Object>();

    /**
     * List of configuration parameters marked final.
     * finalParameters holds the parameters marked final, which may not be changed.
     */
    private Set<String> finalParameters = new HashSet<String>();

    /**
     * Whether to load the default resources.
     */
    private boolean loadDefaults = true;

    /**
     * Configuration objects
     */
    private static final WeakHashMap<Configuration,Object> REGISTRY =
        new WeakHashMap<Configuration,Object>();

    /**
     * List of default Resources. Resources are loaded in the order of the list
     * entries
     */
    private static final CopyOnWriteArrayList<String> defaultResources =
        new CopyOnWriteArrayList<String>();

This is only part of the list; these fields mostly hold resource-related data. One more field is especially important:

    // Properties read from the resource configuration files are loaded into this object
    private Properties properties;

All property values are stored in a java.util.Properties object, which makes later reads and writes straightforward. Properties is a subclass of Hashtable. Let's follow the order in which Configuration loads and walk through the whole process. First comes the static initializer block:

    static {
      // print deprecation warning if hadoop-site.xml is found in classpath
      ClassLoader cL = Thread.currentThread().getContextClassLoader();
      if (cL == null) {
        cL = Configuration.class.getClassLoader();
      }
      if (cL.getResource("hadoop-site.xml") != null) {
        LOG.warn("DEPRECATED: hadoop-site.xml found in the classpath. " +
            "Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, " +
            "mapred-site.xml and hdfs-site.xml to override properties of " +
            "core-default.xml, mapred-default.xml and hdfs-default.xml " +
            "respectively");
      }
      // Register the default configuration files; core-site.xml holds the user's settings.
      // When both define the same property, the later file overrides the earlier one.
      addDefaultResource("core-default.xml");
      addDefaultResource("core-site.xml");
    }

Anyone who has studied Java's initialization order knows that a static initializer block runs once, when the class is loaded, and therefore before any constructor. So the operations above run first, which brings us to addDefaultResource():

    /**
     * Add a default resource. Resources are loaded in the order of the resources
     * added.
     * @param name file name. File should be present in the classpath.
     */
    public static synchronized void addDefaultResource(String name) {
      if (!defaultResources.contains(name)) {
        defaultResources.add(name);
        // Walk every registered Configuration and trigger a reload
        for (Configuration conf : REGISTRY.keySet()) {
          if (conf.loadDefaults) {
            conf.reloadConfiguration();
          }
        }
      }
    }

The method adds the resource name to the default-resource list, then iterates over every registered Configuration and reloads it. The list of default resources has changed, so a reload is needed, which is easy enough to understand. A quick note: every Configuration instance is added to REGISTRY when it is constructed. REGISTRY is a static field, so there is a single registry shared across the whole JVM. Now the focus shifts to reloadConfiguration():

    /**
     * Reload configuration from previously added resources.
     *
     * This method will clear all the configuration read from the added
     * resources, and final parameters. This will make the resources to
     * be read again before accessing the values. Values that are added
     * via set methods will overlay values read from the resources.
     */
    public synchronized void reloadConfiguration() {
      // Reloading a Configuration simply discards the properties it has already read
      properties = null;        // trigger reload
      finalParameters.clear();  // clear site-limits
    }

The operation is very simple: it only clears state. At this point you might wonder whether the new resources should be loaded right away. This is in fact a deliberate design decision by the authors, and the answer comes later. With that, the static initializer block has finished, and next the constructor runs:

    /** A new configuration. */
    public Configuration() {
      // A new instance loads the default resources
      this(true);
    }

It then delegates to the overloaded constructor:

    /** A new configuration where the behavior of reading from the default
     * resources can be turned off.
     *
     * If the parameter {@code loadDefaults} is false, the new instance
     * will not load resources from the default files.
     * @param loadDefaults sp
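The walkthrough above leaves open why reloadConfiguration() only clears state instead of reading the files again. One plausible reading is lazy loading: setting the properties field to null marks the instance as stale, and the resources are read again only when a value is next requested. The following is a minimal toy model of that idea, not Hadoop's actual code. The names LazyConfigSketch, LazyConfig, FAKE_FILES, loadCount and the key demo.key are all hypothetical, and in-memory maps stand in for the XML files on the classpath.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.WeakHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Toy model of invalidate-then-lazily-reload; NOT Hadoop's Configuration. */
public class LazyConfigSketch {

    /** Hypothetical stand-in for XML files: resource name -> key/value pairs. */
    static final Map<String, Map<String, String>> FAKE_FILES = new HashMap<>();

    static class LazyConfig {
        // Shared by all instances, like the static fields discussed above.
        private static final Map<LazyConfig, Object> REGISTRY = new WeakHashMap<>();
        private static final List<String> DEFAULT_RESOURCES = new CopyOnWriteArrayList<>();

        private final boolean loadDefaults;
        private Properties properties; // null means "stale, load on next read"
        int loadCount = 0;             // hypothetical counter, only for the demo

        LazyConfig(boolean loadDefaults) {
            this.loadDefaults = loadDefaults;
            synchronized (LazyConfig.class) {
                REGISTRY.put(this, null); // register every new instance
            }
        }

        static synchronized void addDefaultResource(String name) {
            if (!DEFAULT_RESOURCES.contains(name)) {
                DEFAULT_RESOURCES.add(name);
                for (LazyConfig conf : REGISTRY.keySet()) {
                    if (conf.loadDefaults) {
                        conf.reloadConfiguration(); // only invalidates
                    }
                }
            }
        }

        synchronized void reloadConfiguration() {
            properties = null; // cheap: no resource is read here
        }

        private synchronized Properties getProps() {
            if (properties == null) { // the actual load happens on first access
                properties = new Properties();
                loadCount++;
                if (loadDefaults) {
                    for (String name : DEFAULT_RESOURCES) {
                        Map<String, String> file = FAKE_FILES.get(name);
                        if (file != null) {
                            properties.putAll(file); // later resources override earlier ones
                        }
                    }
                }
            }
            return properties;
        }

        String get(String key) {
            return getProps().getProperty(key);
        }
    }

    public static void main(String[] args) {
        FAKE_FILES.put("core-default.xml", Collections.singletonMap("demo.key", "default"));
        FAKE_FILES.put("core-site.xml", Collections.singletonMap("demo.key", "site"));

        LazyConfig conf = new LazyConfig(true);
        LazyConfig.addDefaultResource("core-default.xml");
        System.out.println(conf.loadCount);       // 0: registering a resource loads nothing
        System.out.println(conf.get("demo.key")); // default
        LazyConfig.addDefaultResource("core-site.xml");
        System.out.println(conf.get("demo.key")); // site
        System.out.println(conf.loadCount);       // 2: one load per invalidation that was followed by a read
    }
}
```

In this model, registering several default resources in a row costs almost nothing, because each call only invalidates the cached properties; the resources are merged once, on the next read.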