Finding_NN1 and_CC Archiving_VVG the_AT Internet_NN1 Footprint_NN1 
With_IW the_AT introduction_NN1 of_IO home_NN1 computers_NN2 and_CC electronic_JJ typewriters_NN2 in_II the_AT late_JJ 1970s_MC2 ,_, archivists_NN2 were_VBDR forced_VVN to_TO confront_VVI the_AT fact_NN1 that_CST a_AT1 person_NN1 's_GE "_" papers_NN2 "_" might_VM ,_, in_II fact_NN1 ,_, no_RR21 longer_RR22 be_VBI on_II paper_NN1 ._. 
The_AT power_NN1 of_IO word_NN1 processing_NN1 made_VVD writers_NN2 among_II the_AT first_MD to_TO embrace_VVI information_NN1 technology_NN1 outside_II21 of_II22 government_NN1 and_CC the_AT financial_JJ sector_NN1 ._. 
And_CC because_CS writers_NN2 often_RR made_VVN small_JJ purchases_NN2 and_CC were_VBDR not_XX constrained_VVN by_II prior_JJ investment_NN1 ,_, they_PPHS2 frequently_RR purchased_VVN equipment_NN1 from_II small_JJ niche_NN1 manufacturers_NN2 whose_DDQGE technology_NN1 did_VDD not_XX become_VVI dominant_JJ ._. 
As_II a_AT1 result_NN1 ,_, preserving_VVG and_CC cataloging_VVG the_AT earliest_JJT electronic_JJ records_NN2 consisted_VVN of_IO two_MC intertwined_JJ problems_NN2 :_: the_AT task_NN1 of_IO finding_NN1 and_CC copying_VVG the_AT data_NN off_II magnetic_JJ media_NN before_II the_AT media_NN deteriorates_VVZ ,_, and_CC the_AT challenge_NN1 of_IO reading_VVG older_JJR and_CC sometimes_RT obscure_JJ formats_NN2 that_CST are_VBR no_RR21 longer_RR22 in_II widespread_JJ use_NN1 ._. 
Archivists_NN2 are_VBR now_RT on_II the_AT brink_NN1 of_IO a_AT1 far_RG more_RGR disruptive_JJ change_NN1 than_CSN the_AT transition_NN1 from_II paper_NN1 to_II electronic_JJ media_NN :_: the_AT transition_NN1 from_II personal_JJ to_TO "_" cloud_VVI computing_NN1 ._. "_" 
In_II the_AT very_RG near_JJ future_NN1 an_AT1 archivist_NN1 might_VM enter_VVI the_AT office_NN1 of_IO a_AT1 deceased_JJ writer_NN1 and_CC find_VV0 no_AT electronic_JJ files_NN2 of_IO personal_JJ significance_NN1 :_: the_AT author_NN1 's_GE appointment_NN1 calendar_NN1 might_VM be_VBI split_VVN between_II her_APPGE organization_NN1 's_GE Microsoft_NP1 Exchange_NN1 server_NN1 and_CC Yahoo_NP1 Calendar_NN1 ;_; her_APPGE unfinished_JJ and_CC unpublished_JJ documents_NN2 stored_VVN on_II Google_NP1 Docs_NN2 ;_; her_APPGE diary_NN1 stored_VVN at_II the_AT online_JJ LiveJournal_NP1 service_NN1 ;_; correspondence_NN1 archived_VVN on_II the_AT Facebook_NN1 "_" walls_NN2 "_" of_IO her_APPGE close_JJ friends_NN2 ;_; and_CC her_APPGE most_RGT revealing_JJ ,_, insightful_JJ and_CC critical_JJ comments_NN2 scattered_VVN as_CSA anonymous_JJ and_CC pseudonymous_JJ comments_NN2 on_II the_AT blogs_NN2 of_IO her_APPGE friends_NN2 ,_, collaborators_NN2 ,_, and_CC rivals_NN2 ._. 
Although_CS there_EX are_VBR numerous_JJ public_NN1 and_CC commercial_JJ projects_NN2 underway_RR to_TO find_VVI and_CC preserve_VVI public_JJ web-based_JJ content_NN1 ,_, these_DD2 projects_NN2 will_VM not_XX be_VBI useful_JJ to_II future_JJ historians_NN2 if_CS there_EX is_VBZ no_AT way_NN1 to_TO readily_RR find_VVI the_AT information_NN1 that_CST is_VBZ of_IO interest_NN1 ._. 
And_CC of_RR21 course_RR22 ,_, none_PN of_IO the_AT archiving_NN1 projects_NN2 are_VBR able_JK to_TO archive_VVI content_JJ that_CST is_VBZ private_JJ or_CC otherwise_RR restricted_JJ --_NN1 as_CSA will_VM increasingly_RR be_VBI the_AT case_NN1 of_IO personal_JJ information_NN1 that_CST is_VBZ stored_VVN in_II the_AT "_" cloud_NN1 ._. "_" 
This_DD1 paper_NN1 introduces_VVZ and_CC explores_VVZ the_AT problem_NN1 of_IO finding_VVG and_CC archiving_VVG a_AT1 person_NN1 's_GE Internet_NN1 footprint_NN1 ._. 
In_II Section_NN1 2_MC we_PPIS2 define_VV0 the_AT term_NN1 Internet_NN1 footprint_NN1 and_CC provide_VV0 numerous_JJ examples_NN2 of_IO the_AT footprint_NN1 's_GE extent_NN1 ._. 
In_II Section_NN1 3_MC we_PPIS2 present_VV0 a_AT1 variety_NN1 of_IO approaches_NN2 for_IF finding_VVG the_AT footprint_NN1 ._. 
In_II Section_NN1 4_MC we_PPIS2 discuss_VV0 technical_JJ concerns_NN2 for_IF archiving_VVG the_AT footprint_NN1 ._. 
Web_NN1 archiving_NN1 has_VHZ received_VVN significant_JJ exploration_NN1 in_II recent_JJ years_NNT2 ,_, including_II the_AT use_NN1 of_IO proxies_NN2 to_TO collect_VVI data_NN ,_, the_AT need_NN1 for_IF proper_JJ record_NN1 management_NN1 ,_, and_CC the_AT difficulty_NN1 of_IO reconstructing_VVG lost_JJ websites_NN2 from_II the_AT web_NN1 infrastructure_NN1 ._. 
Researchers_NN2 have_VH0 also_RR characterized_VVN the_AT Web_NN1 's_GE "_" decay_NN1 "_" ._. 
Jatowt_NP1 et_RA21 al_RA22 ._. have_VH0 developed_VVN techniques_NN2 for_IF automatically_RR detecting_VVG the_AT age_NN1 of_IO a_AT1 web_NN1 page_NN1 ._. 
Juola_NN1 provides_VVZ a_AT1 review_NN1 of_IO current_JJ authorship_NN1 determination_NN1 techniques_NN2 ._. 
There_EX are_VBR numerous_JJ open_JJ source_NN1 and_CC commercially_RR available_JJ face_NN1 recognition_NN1 products_NN2 ,_, including_II FaceIt_NP1 by_II Visionics_NP1 ,_, FaceVACS_NP1 by_II Plettac_NP1 ,_, and_CC ImageWare_NP1 Software_NN1 ._. 
Zhao_NP1 et_RA21 al_RA22 ._. and_CC Datta_NP1 et_RA21 al_RA22 ._. have_VH0 both_DB2 published_VVN comprehensive_JJ surveys_NN2 of_IO current_JJ research_NN1 and_CC technology_NN1 ._. 
Viegas_NP1 et_RA21 al_RA22 ._. examined_VVD cooperation_NN1 and_CC conflict_NN1 between_II authors_NN2 by_II analyzing_VVG Wikipedia_NP1 logs_NN2 ._. 
Other_JJ relevant_JJ work_NN1 on_II Wikipedia_NP1 includes_VVZ analysis_NN1 of_IO participation_NN1 and_CC statistical_JJ models_NN2 that_CST can_VM predict_VVI future_JJ administrators_NN2 ._. 
Consider_VV0 the_AT staggering_JJ range_NN1 of_IO Internet_NN1 services_NN2 that_CST a_AT1 person_NN1 uses_VVZ during_II the_AT course_NN1 of_IO a_AT1 year_NNT1 ._. 
Some_DD of_IO these_DD2 are_VBR public_JJ publication_NN1 services_NN2 like_II BBC_NP1 or_CC CNN_NP1 News_NN1 --_NN1 services_NN2 that_CST are_VBR little_RR more_DAR than_CSN traditional_JJ television_NN1 ,_, radio_NN1 or_CC newspaper_NN1 repurposed_VVN to_II the_AT Internet_NN1 ,_, and_CC that_CST most_DAT Internet_NN1 users_NN2 access_VV0 anonymously_RR ._. 
Other_JJ services_NN2 are_VBR public_JJ and_CC highly_RR personalized_VVD --_JJ blogs_NN2 and_CC home_NN1 pages_NN2 ,_, for_REX21 example_REX22 ._. 
Still_RR other_JJ services_NN2 are_VBR private_JJ and_CC personal_JJ ,_, like_II an_AT1 online_JJ calendar_NN1 or_CC diary_NN1 ._. 
These_DD2 services_NN2 can_VM be_VBI operated_VVN by_II an_AT1 organization_NN1 for_IF its_APPGE employees_NN2 ,_, such_II21 as_II22 a_AT1 company_NN1 running_VVG a_AT1 Microsoft_NP1 Exchange_NN1 server_NN1 ,_, or_CC they_PPHS2 can_VM be_VBI operated_VVN on_II a_AT1 global_JJ scale_NN1 for_IF millions_NNO2 of_IO users_NN2 ,_, such_II21 as_II22 Google_NP1 Calendar_NN1 ._. 
This_DD1 section_NN1 considers_VVZ the_AT wide_JJ range_NN1 of_IO information_NN1 that_CST an_AT1 originator_NN1 may_VM create_VVI in_II other_JJ computers_NN2 on_II the_AT Internet_NN1 through_II their_APPGE own_DA actions_NN2 --_NN1 the_AT originator_NN1 's_GE Internet_NN1 Footprint_NN1 ._. 
A_AT1 person_NN1 's_GE public_JJ identified_JJ footprint_NN1 is_VBZ any_DD information_NN1 that_CST they_PPHS2 created_VVD which_DDQ is_VBZ online_RR ,_, widely_RR available_JJ ,_, and_CC specifically_RR linked_VVN to_II the_AT author_NN1 's_GE real_JJ name_NN1 ._. 
For_IF originators_NN2 that_CST are_VBR authors_NN2 ,_, their_APPGE public_JJ footprint_NN1 almost_RR certainly_RR includes_VVZ articles_NN2 that_CST have_VH0 been_VBN published_VVN under_II the_AT originator_NN1 's_GE own_DA name_NN1 in_II web-only_JJ publications_NN2 such_II21 as_II22 Slate_NN1 Magazine_NN1 or_CC Salon.com_NP1 ._. 
The_AT public_JJ footprint_NN1 may_VM also_RR include_VVI letters_NN2 to_II the_AT editor_NN1 ._. 
(_( John_NP1 Updike_NP1 once_RR wrote_VVD a_AT1 letter_NN1 to_II the_AT editor_NN1 of_IO the_AT Boston_NP1 Globe_NN1 advocating_VVG that_CST the_AT comics_NN2 page_NN1 retain_VV0 "_" Spiderman_NN1 ._. "_" )_) 
Individuals_NN2 may_VM also_RR publish_VVI their_APPGE own_DA writing_NN1 on_II personal_JJ web_NN1 sites_NN2 (_( "_" home_NN1 pages_NN2 "_" and_CC "_" blogs_NN2 "_" )_) ._. 
Websites_NN2 can_VM not_XX be_VBI relied_VVN upon_II to_TO archive_VVI their_APPGE own_DA material_NN1 ,_, because_CS the_AT websites_NN2 may_VM not_XX exist_VVI in_II the_AT future_NN1 ._. 
For_REX21 example_REX22 ,_, in_II the_AT late_JJ 1990s_MC2 thousands_NNO2 of_IO articles_NN2 and_CC columns_NN2 by_II leading_JJ writers_NN2 were_VBDR published_VVN at_II HotWired_JJ ,_, a_AT1 web_NN1 property_NN1 operated_VVN by_II Wired_JJ News_NN1 ._. 
Wired_JJ News_NN1 was_VBDZ eventually_RR sold_VVN to_II Lycos_NP2 ,_, then_RT to_II Conde_NP1 Nast_NP1 ._. 
Numerous_JJ articles_NN2 were_VBDR lost_VVN during_II these_DD2 transfers_NN2 ;_; those_DD2 that_CST are_VBR still_RR available_JJ online_JJ are_VBR not_XX at_II their_APPGE original_JJ Internet_NN1 location_NN1 (_( http_NNU :_: //www.hotwired.com_FU )_) ,_, but_CCB are_VBR now_RT housed_VVN underneath_II the_AT http_NNU :_: //www.wired.com_FU domain_NN1 ._. 
Many_DA2 links_NN2 to_II ,_, between_II and_CC even_RR within_II the_AT articles_NN2 have_VH0 been_VBN broken_VVN as_II a_AT1 result_NN1 ._. 
One_MC1 way_NN1 to_TO retrieve_VVI no_RR21 longer_RR22 extant_JJ web_NN1 pages_NN2 is_VBZ through_II the_AT use_NN1 of_IO the_AT Internet_NP1 "_" WayBack_NP1 Machine_NN1 ,_, "_" operated_VVN by_II the_AT Internet_NN1 Archive_NN1 ._. 
But_CCB here_RL there_EX are_VBR several_DA2 problems_NN2 :_: The_AT Internet_NN1 Archive_NN1 is_VBZ itself_PPX1 another_DD1 organization_NN1 (_( in_II this_DD1 case_NN1 a_AT1 nonprofit_NN1 )_) which_DDQ may_VM cease_VVI operation_NN1 at_II some_DD point_NN1 in_II the_AT future_NN1 ._. 
The_AT Archive_NN1 's_GE coverage_NN1 is_VBZ necessarily_RR incomplete_JJ ._. 
The_AT Internet_NN1 Archive_NN1 may_VM not_XX be_VBI accurate_JJ ._. 
(_( Fred_NP1 Cohen_NP1 has_VHZ demonstrated_VVN that_CST the_AT content_NN1 of_IO "_" past_JJ "_" pages_NN2 on_II the_AT Internet_NN1 WayBack_NP1 machine_NN1 can_VM be_VBI manipulated_VVN from_II the_AT future_NN1 --_NN1 a_AT1 disturbing_JJ fact_NN1 when_CS one_PN1 considers_VVZ that_CST reports_NN2 from_II the_AT WayBack_NP1 machine_NN1 have_VH0 been_VBN entered_VVN into_II evidence_NN1 in_II legal_JJ cases_NN2 without_IW challenge_NN1 from_II opposing_JJ counsel_NN1 ._. )_) 
The_AT WayBack_NP1 machine_NN1 will_VM not_XX archive_VVI websites_NN2 that_CST are_VBR blocked_VVN with_IW an_AT1 appropriate_JJ robots_NN2 exclusion_NN1 file_NN1 (_( robots.txt_NNU )_) ._. 
This_DD1 was_VBDZ especially_RR a_AT1 problem_NN1 for_IF the_AT "_" Journalspace_NP1 "_" online_JJ journal_NN1 ,_, which_DDQ was_VBDZ wiped_VVN out_RP on_II January_NPM1 2_MC ,_, 2009_MC due_II21 to_II22 an_AT1 operator_NN1 error_NN1 and_CC the_AT lack_NN1 of_IO backups_NN2 ._. 
As_CSA it_PPH1 turns_VVZ out_RP ,_, Journalspace_NP1 had_VHD a_AT1 robots.txt_NNU file_NN1 that_CST prohibited_VVD archiving_VVG by_II services_NN2 such_II21 as_II22 Internet_NP1 Archive_NN1 and_CC Google_NP1 ._. 
Rather_II21 than_II22 hoping_VVG that_CST another_DD1 organization_NN1 has_VHZ managed_VVN to_TO sweep_VVI up_RP an_AT1 individual_NN1 's_GE relevant_JJ web_NN1 pages_NN2 in_II a_AT1 global_JJ cataloging_NN1 of_IO the_AT Internet_NN1 ,_, it_PPH1 almost_RR certainly_RR makes_VVZ more_DAR sense_NN1 for_IF archivists_NN2 to_TO go_VVI out_RP and_CC get_VVI the_AT material_NN1 themselves_PPX2 ._. 
The_AT Public_JJ Footprint_NN1 may_VM also_RR contain_VVI information_NN1 at_II social_JJ networking_NN1 websites_NN2 such_II21 as_II22 Facebook_NP1 ,_, MySpace_NP1 and_CC LinkedIn_NP1 ._. 
These_DD2 websites_NN2 contain_VV0 not_XX just_RR information_NN1 that_CST a_AT1 person_NN1 posted_VVD ,_, but_CCB documentation_NN1 of_IO a_AT1 person_NN1 's_GE social_JJ network_NN1 --_NN1 their_APPGE "_" friends_NN2 "_" and_CC associates_NN2 --_NN1 as_II31 well_II32 as_II33 documentation_NN1 of_IO a_AT1 person_NN1 's_GE preferences_NN2 in_II the_AT form_NN1 of_IO "_" recommendations_NN2 "_" messages_NN2 ._. 
Websites_NN2 such_II21 as_II22 Flickr_NP1 and_CC Picasa_NP1 hold_VV0 photographs_NN2 that_CST a_AT1 person_NN1 may_VM have_VHI uploaded_VVN ._. 
What_DDQ a_AT1 treasure_NN1 for_IF future_JJ historians_NN2 trying_VVG to_TO understand_VVI the_AT life_NN1 of_IO an_AT1 individual_JJ !_! 
What_DDQ a_AT1 quandary_NN1 for_IF an_AT1 archivist_NN1 ,_, for_IF these_DD2 websites_NN2 actively_RR encourage_VV0 originators_NN2 to_TO intermix_VVI the_AT personal_JJ and_CC the_AT professional_JJ ._. 
Only_RR through_II consultation_NN1 with_IW families_NN2 and_CC other_JJ interested_JJ parties_NN2 will_VM archivists_NN2 be_VBI able_JK to_TO determine_VVI which_DDQ "_" personal_JJ "_" information_NN1 should_VM be_VBI made_VVN immediately_RR available_JJ ,_, which_DDQ information_NN1 should_VM be_VBI kept_VVN in_II closed_JJ collections_NN2 until_CS a_AT1 suitable_JJ amount_NN1 of_IO time_NNT1 has_VHZ passed_VVN ,_, and_CC what_DDQ should_VM be_VBI destroyed_VVN ._. 
Finally_RR ,_, a_AT1 person_NN1 's_GE public_JJ footprint_NN1 might_VM contain_VVI information_NN1 that_CST the_AT person_NN1 thinks_VVZ is_VBZ private_JJ but_CCB which_DDQ is_VBZ ,_, in_II fact_NN1 ,_, public_JJ ._. 
It_PPH1 is_VBZ notoriously_RR difficult_JJ to_TO audit_VVI security_NN1 settings_NN2 because_CS they_PPHS2 are_VBR complex_JJ and_CC not_XX generally_RR apparent_JJ within_II today_RT 's_GE user_NN1 interfaces_NN2 ._. 
As_II a_AT1 result_NN1 ,_, it_PPH1 is_VBZ common_JJ for_IF computer_NN1 users_NN2 to_TO make_VVI information_NN1 publicly_RR available_JJ when_CS they_PPHS2 do_VD0 not_XX intend_VVI to_TO do_VDI so_RR ._. 
Good_JJ and_CC Krekelberg_NP1 explored_VVD the_AT Kazaa_NP1 user_NN1 interface_NN1 and_CC discovered_VVD that_CST it_PPH1 was_VBDZ relatively_RR easy_JJ for_IF individuals_NN2 to_TO "_" share_VVI "_" their_APPGE entire_JJ hard_JJ drive_NN1 to_II a_AT1 file_NN1 sharing_VVG network_NN1 when_CS they_PPHS2 intended_VVD to_TO just_RR share_VVI a_AT1 few_DA2 documents_NN2 or_CC folders_NN2 ._. 
Sometimes_RT such_DA inadvertent_JJ public_JJ sharing_NN1 can_VM have_VHI important_JJ political_JJ ,_, social_JJ ,_, or_CC historical_JJ dimensions_NN2 :_: in_II June_NPM1 2008_MC ,_, Judge_NP1 Alex_NP1 Kozinski_NN1 of_IO the_AT 9th_MD US_NP1 Circuit_NN1 Court_NN1 of_IO Appeals_NN2 was_VBDZ found_VVN to_TO have_VHI sexually_RR explicit_JJ photos_NN2 and_CC videos_NN2 on_II his_APPGE own_DA personal_JJ website_NN1 --_JJ relevant_JJ ,_, as_CSA the_AT Judge_NN1 was_VBDZ himself_PPX1 overseeing_VVG an_AT1 obscenity_NN1 trial_NN1 ._. 
Although_CS not_XX strictly_RR part_NN1 of_IO the_AT "_" Internet_NN1 "_" footprint_NN1 ,_, many_DA2 organizations_NN2 operate_VV0 their_APPGE own_DA data_NN services_NN2 on_II which_DDQ an_AT1 originator_NN1 could_VM easily_RR store_VVI information_NN1 ._. 
For_REX21 example_REX22 ,_, many_DA2 businesses_NN2 and_CC organizations_NN2 run_VV0 their_APPGE own_DA web-based_JJ calendar_NN1 and_CC email_NN1 services_NN2 ._. 
These_DD2 services_NN2 may_VM also_RR cause_VVI problems_NN2 for_IF archivists_NN2 because_CS they_PPHS2 can_VM be_VBI hard_JJ to_TO find_VVI and_CC may_VM not_XX be_VBI readily_RR interested_JJ in_II sharing_VVG their_APPGE information_NN1 --_NN1 even_CS21 when_CS22 the_AT originator_NN1 or_CC the_AT originator_NN1 's_GE family_NN1 strongly_RR favor_VV0 information_NN1 sharing_VVG ._. 
Beyond_II the_AT information_NN1 that_CST a_AT1 person_NN1 published_VVN under_II their_APPGE own_DA name_NN1 ,_, there_EX is_VBZ potentially_RR a_AT1 wealth_NN1 of_IO information_NN1 that_CST is_VBZ publicly_RR available_JJ but_CCB published_VVD under_II a_AT1 different_JJ name_NN1 or_CC a_AT1 non-standard_JJ email_NN1 address_NN1 --_NN1 an_AT1 electronic_JJ pseudonym_NN1 ._. 
There_EX are_VBR many_DA2 reasons_NN2 why_RRQ an_AT1 individual_NN1 might_VM publish_VVI information_NN1 to_II the_AT public_NN1 using_VVG a_AT1 pseudonym_NN1 :_: Information_NN1 might_VM be_VBI published_VVN under_II a_AT1 different_JJ name_NN1 in_II an_AT1 attempt_NN1 to_TO preserve_VVI privacy_NN1 ._. 
The_AT individual_NN1 might_VM have_VHI a_AT1 well-established_JJ pen_NN1 name_NN1 (_( for_REX21 example_REX22 ,_, Charles_NP1 Lutwidge_NP1 Dodgson_NP1 blogging_VVG as_CSA Lewis_NP1 Carroll_NP1 )_) ._. 
The_AT individual_NN1 might_VM be_VBI a_AT1 fiction_NN1 writer_NN1 and_CC be_VBI publishing_VVG the_AT information_NN1 online_RR using_VVG the_AT persona_NN1 of_IO a_AT1 fictional_JJ character_NN1 (_( for_REX21 example_REX22 ,_, Dodgson_NP1 blogging_VVG as_II the_AT Queen_NN1 of_IO Hearts_NN2 )_) ._. 
The_AT information_NN1 might_VM appear_VVI in_II an_AT1 online_JJ forum_NN1 where_CS there_EX is_VBZ a_AT1 community_NN1 norm_NN1 that_CST prohibits_VVZ publishing_VVG information_NN1 under_II a_AT1 "_" real_JJ name_NN1 ,_, "_" or_CC the_AT online_JJ forum_NN1 might_VM assign_VVI pseudonyms_NN2 as_II a_AT1 matter_NN1 of_RR21 course_RR22 ._. 
Another_DD1 person_NN1 might_VM already_RR be_VBI using_VVG the_AT individual_NN1 's_GE name_NN1 ,_, forcing_VVG the_AT originator_NN1 to_TO pick_VVI a_AT1 different_JJ name_NN1 ._. 
The_AT individual_NN1 might_VM be_VBI a_AT1 government_NN1 or_CC corporate_JJ official_NN1 and_CC be_VBI prohibited_VVN from_II posting_VVG under_II their_APPGE own_DA name_NN1 for_IF policy_NN1 reasons_NN2 ._. 
(_( For_REX21 example_REX22 ,_, Whole_JJ Foods_NN2 President_NNB John_NP1 P._NP1 Mackey_NP1 blogged_VVD under_II the_AT pseudonym_NN1 Rahodeb_NP1 ,_, a_AT1 play_NN1 on_II his_APPGE wife_NN1 's_GE name_NN1 Deborah_NP1 ._. )_) 
Another_DD1 way_NN1 to_TO locate_VVI the_AT originator_NN1 's_GE Internet_NN1 footprint_NN1 is_VBZ by_II searching_VVG for_IF it_PPH1 ._. 
Two_MC kinds_NN2 of_IO search_NN1 are_VBR possible_JJ ._. 
First_MD ,_, the_AT archivist_NN1 could_VM simply_RR search_VVI for_IF the_AT originator_NN1 's_GE name_NN1 (_( or_CC aliases_NN2 )_) on_II Internet_NP1 search_NN1 systems_NN2 such_II21 as_II22 Google_NP1 and_CC Yahoo_NP1 ._. 
Second_MD ,_, the_AT archivist_NN1 could_VM go_VVI specifically_RR to_II websites_NN2 such_II21 as_II22 Facebook_NP1 ,_, MySpace_NP1 and_CC Flickr_NP1 ,_, and_CC conduct_VV0 searches_NN2 there_RL ._. 
Search_NN1 is_VBZ complicated_VVN by_II the_AT fact_NN1 that_CST many_DA2 people_NN share_VV0 the_AT same_DA name_NN1 ._. 
Bekkerman_NN1 and_CC McCallum_NP1 note_VV0 that_CST a_AT1 search_NN1 for_IF the_AT name_NN1 "_" David_NP1 Mulford_NP1 "_" on_II Google_NP1 correctly_RR retrieves_VVZ information_NN1 about_II a_AT1 US_NP1 Ambassador_NN1 to_II India_NP1 ,_, "_" two_MC business_NN1 managers_NN2 ,_, a_AT1 musician_NN1 ,_, a_AT1 student_NN1 ,_, a_AT1 scientist_NN1 ,_, and_CC a_AT1 few_DA2 others_NN2 "_" --_NN1 all_DB people_NN who_PNQS share_VV0 the_AT same_DA name_NN1 ._. 
Which_DDQ David_NP1 Mulford_NP1 is_VBZ the_AT "_" right_JJ "_" David_NP1 Mulford_NP1 depends_VVZ on_II the_AT context_NN1 of_IO the_AT search_NN1 ._. 
Sometimes_RT it_PPH1 is_VBZ difficult_JJ to_TO determine_VVI if_CS two_MC seemingly_RR different_JJ individuals_NN2 are_VBR in_II fact_NN1 the_AT same_DA person_NN1 ._. 
Consider_VV0 again_RT the_AT search_NN1 for_IF "_" David_NP1 Mulford_NP1 :_: "_" 
There_EX is_VBZ an_AT1 old_JJ story_NN1 of_IO an_AT1 assistant_NN1 at_II MIT_NP1 who_PNQS worked_VVD for_IF a_AT1 famous_JJ professor_NN1 in_II one_MC1 of_IO the_AT physical_JJ science_NN1 departments_NN2 ._. 
One_MC1 day_NNT1 the_AT professor_NN1 died_VVD after_II a_AT1 long_JJ illness_NN1 ._. 
Shortly_RR thereafter_RT ,_, the_AT assistant_NN1 received_VVD a_AT1 phone_NN1 call_NN1 from_II the_AT Institute_NN1 Archivist_NN1 who_PNQS wanted_VVD to_TO stop_VVI by_RP and_CC evaluate_VVI the_AT professor_NN1 's_GE papers_NN2 ._. 
The_AT assistant_NN1 said_VVD that_CST she_PPHS1 had_VHD been_VBN expecting_VVG the_AT archivist_NN1 and_CC had_VHD already_RR "_" cleaned_VVD them_PPHO2 up_RP "_" in_II anticipation_NN1 of_IO the_AT visit_NN1 ._. 
When_CS the_AT archivist_NN1 arrived_VVD the_AT extent_NN1 of_IO the_AT cleaning_NN1 became_VVD evident_JJ :_: the_AT assistant_NN1 had_VHD thrown_VVN out_RP the_AT professor_NN1 's_GE scratch_NN1 pads_NN2 ,_, his_APPGE doodles_NN2 ,_, a_AT1 box_NN1 of_IO business_NN1 receipts_NN2 ,_, and_RR31 so_RR32 on_RR33 ,_, and_CC prepared_VVN for_IF the_AT archivist_NN1 a_AT1 neat_JJ folder_NN1 showing_VVG all_DB of_IO the_AT professor_NN1 's_GE speeches_NN2 ,_, published_VVD articles_NN2 ,_, and_CC honors_NN2 ._. 
The_AT archivist_NN1 was_VBDZ devastated_VVN ._. 
Although_CS many_DA2 archivists_NN2 know_VV0 that_CST they_PPHS2 may_VM need_VVI to_TO act_VVI with_IW haste_NN1 in_BCL21 order_BCL22 to_TO preserve_VVI the_AT physical_JJ papers_NN2 of_IO the_AT deceased_JJ ,_, this_DD1 story_NN1 of_IO the_AT archivist_NN1 and_CC the_AT assistant_NN1 is_VBZ in_II danger_NN1 of_IO playing_VVG out_RP with_IW great_JJ frequency_NN1 in_II tomorrow_RT 's_GE cloud-based_JJ world_NN1 of_IO electronic_JJ records_NN2 ._. 
For_REX21 example_REX22 ,_, photo_NN1 sharing_VVG websites_NN2 such_II21 as_II22 AOL_NP1 Pictures_NN2 have_VH0 deleted_VVN uploaded_JJ pictures_NN2 that_CST were_VBDR not_XX viewed_VVN for_IF 60_MC days_NNT2 ,_, or_CC when_CS the_AT owner_NN1 of_IO the_AT account_NN1 failed_VVD to_TO log_VVI in_RP for_IF 90_MC days_NNT2 ._. 
Some_DD services_NN2 delete_VV0 photos_NN2 when_RRQ monthly_JJ fees_NN2 are_VBR no_RR21 longer_RR22 paid_VVN ._. 
Archivists_NN2 would_VM need_VVI to_TO move_VVI fast_RR to_TO rescue_VVI an_AT1 originator_NN1 's_GE photos_NN2 stored_VVN on_II such_DA a_AT1 service_NN1 ._. 
