RSS feed sometimes contains http and other time https links

  • The RSS feed of a site hosted on http://wordpress.com, https://rychlofky.cz/feed, is alternately generated with http links and, at other times of the day, with https links.

    Compare:
    https://pastebin.com/nvp5sEZe fetched on Feb 28 12:06 CET
    https://pastebin.com/f13ChBE1 fetched on Feb 28 22:58 CET

    Is that something that an owner could change/control?

  • Hello there,

    Many thanks for reaching out.

    SSL certificates are automatically applied on WordPress.com. When I look at the feed, it looks like HTTPS is in use there.

    Are you trying to use the feed and running into problems? Perhaps we can offer some insights into this.

    Many thanks in advance.

  • If you look inside of the feed, you’ll see links to http, e.g. lines 11-14 from 22:58, or even now if you fetch https://rychlofky.cz/feed/:

    <channel>
    <title>rychlofky</title>
    <atom:link href="http://rychlofky.cz/feed/" rel="self" type="application/rss+xml" />
    <link>http://rychlofky.cz</link>

    whereas at 12:06 it was:

    <channel>
    <title>rychlofky</title>
    <atom:link href="https://rychlofky.cz/feed/" rel="self" type="application/rss+xml" />
    <link>https://rychlofky.cz</link>

    Yes, the RSS reader https://www.inoreader.com/ displays duplicate entries. That is not surprising, because URLs are used in the guid: once with http and once with https.

  • Maybe to show it in another way.

    As you say:

    SSL certificates are automatically applied on WordPress.com

    so when I access site through http:

    $ curl -v http://rychlofky.cz/feed/                                                    
    *   Trying 192.0.78.24:80...
    * Connected to rychlofky.cz (192.0.78.24) port 80 (#0)
    > GET /feed/ HTTP/1.1
    > Host: rychlofky.cz
    > User-Agent: curl/7.71.1
    > Accept: */*
    > 
    * Mark bundle as not supporting multiuse
    < HTTP/1.1 301 Moved Permanently
    < Server: nginx
    < Date: Thu, 04 Mar 2021 11:20:38 GMT
    < Content-Type: text/html
    < Content-Length: 162
    < Connection: keep-alive
    < Location: https://rychlofky.cz/feed/
    < X-ac: 2.hhn _dca 
    < 
    <html>
    <head><title>301 Moved Permanently</title></head>
    <body>
    <center><h1>301 Moved Permanently</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
    * Connection #0 to host rychlofky.cz left intact
    

    I’m redirected to https.

    On the other hand, when I programmatically search for http:// entries in the feed, I get a lot of links to http://rychlofky.cz:

    
    $ curl -s https://rychlofky.cz/feed/ | grep http://rychlofky.cz     
    	<atom:link href="http://rychlofky.cz/feed/" rel="self" type="application/rss+xml" />
    	<link>http://rychlofky.cz</link>
    		<link>http://rychlofky.cz</link>
    	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://rychlofky.cz/osd.xml" title="rychlofky" />
    	<atom:link rel='hub' href='http://rychlofky.cz/?pushpress=hub'/>
    		<link>http://rychlofky.cz/2021/03/03/google-oznamil-flutter-2/</link>
    					<comments>http://rychlofky.cz/2021/03/03/google-oznamil-flutter-2/#respond</comments>
    					<description><![CDATA[Google oznámil Flutter 2. Novou verzi nástroje na vytváření přenositelných aplikaci pro jakoukoliv platformu (iOS, Android, Windows, Mac OS a Linux, plus webové podoby pro Chrome, Firefox, Safari či Edge).<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/google-oznamil-flutter-2/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/google-oznamil-flutter-2/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/03/bezheslove-prihlasovani-bude-novou-soucasti-microsoft-azure-active-directory/</link>
    					<comments>http://rychlofky.cz/2021/03/03/bezheslove-prihlasovani-bude-novou-soucasti-microsoft-azure-active-directory/#respond</comments>
    					<description><![CDATA[Bezheslové přihlašování bude novou součástí Microsoft Azure Active Directory. Odzkoušeno je už zhruba 200 miliony lidmi. Novinkou i Temporary Access Pass umožňujeí snazší vstup do nové služby bez generování hesel. Samozřejmě aby něco takového fungovalo, budete potřebovat nějaký ten kousek železa co podporuje standardy od FIDO Alliance.<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/bezheslove-prihlasovani-bude-novou-soucasti-microsoft-azure-active-directory/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/bezheslove-prihlasovani-bude-novou-soucasti-microsoft-azure-active-directory/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/03/power-fx-je-novy-open-source-low-code-programovaci-jazyk-od-microsoftu/</link>
    					<comments>http://rychlofky.cz/2021/03/03/power-fx-je-novy-open-source-low-code-programovaci-jazyk-od-microsoftu/#respond</comments>
    					<description><![CDATA[Power Fx je nový open source low-code programovací jazyk od Microsoftu. Kořeny má ve vzorcích v Excelu a měl by se stát základem pro Microsoft Power Platform (Power Apps) prostředí. Domovem je v microsoft/Power-Fx na GitHubu a má i blog, viz Introducing Microsoft Power Fx: the low-code programming language for everyone<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/power-fx-je-novy-open-source-low-code-programovaci-jazyk-od-microsoftu/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/power-fx-je-novy-open-source-low-code-programovaci-jazyk-od-microsoftu/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/03/sony-playstatio-store-konci-s-pronajmem-i-prodejem-filmu-a-serialu/</link>
    					<comments>http://rychlofky.cz/2021/03/03/sony-playstatio-store-konci-s-pronajmem-i-prodejem-filmu-a-serialu/#respond</comments>
    					<description><![CDATA[Sony PlayStatio Store končí s pronájmem i prodejem filmů a seriálů k 31. srpnu 2021 po více než dekádě snah uchytit se na trhu s filmy a seriály. Uvádějí, že uživatelé PlayStation více a více využívají zdarma nebo za předplatné dostupné video streamingové služby.<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/sony-playstatio-store-konci-s-pronajmem-i-prodejem-filmu-a-serialu/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/sony-playstatio-store-konci-s-pronajmem-i-prodejem-filmu-a-serialu/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/03/valheim-prodal-5-milionu-kopii-za-jediny-mesic-na-steamu/</link>
    					<comments>http://rychlofky.cz/2021/03/03/valheim-prodal-5-milionu-kopii-za-jediny-mesic-na-steamu/#respond</comments>
    					<description><![CDATA[Valheim prodal 5 milionů kopií za jediný měsíc na Steamu. Vývojář Iron Gate Studio hru prodává za 19.99 USD, takže celkové tržby jsou 100 milionů dolarů. Něco samozřejmě zůstane Valve, ale stále je to skvělý úspěch. Hru studio vyvinulo v pěti lidech a v Early Acess na Steamu se velmi rychle dostala na pět navíce…<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/valheim-prodal-5-milionu-kopii-za-jediny-mesic-na-steamu/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/valheim-prodal-5-milionu-kopii-za-jediny-mesic-na-steamu/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/03/google-po-konci-cookies-tretich-stran-neplanuje-novou-sledovaci-technologii/</link>
    					<comments>http://rychlofky.cz/2021/03/03/google-po-konci-cookies-tretich-stran-neplanuje-novou-sledovaci-technologii/#respond</comments>
    					<description><![CDATA[Google po konci cookies třetích stran neplánuje novou sledovací technologii. Nebude se tedy snažit najít alternativu, který by umožnila jednotlivce sledovat napříč webem. V oznámení také dodávají, že inzeráty bude stále možné cílit a současné technologie k tomu nabízejí dostatek příležitostí. Nutno dodat, že ne všechny jsou z pohledu ochrany soukromí zcela košer. Detaily od…<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/03/google-po-konci-cookies-tretich-stran-neplanuje-novou-sledovaci-technologii/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/03/google-po-konci-cookies-tretich-stran-neplanuje-novou-sledovaci-technologii/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/02/jsou-prohlizece-od-xiaomi-spyware/</link>
    					<comments>http://rychlofky.cz/2021/03/02/jsou-prohlizece-od-xiaomi-spyware/#respond</comments>
    					<description><![CDATA[Jsou prohlížeče od Xiaomi spyware? Někdy v dubnu 2020 se toto téma objevilo v americkém Forbesu (Exclusive: Warning Over Chinese Mobile Giant Xiaomi Recording Millions Of People’s ‘Private’ Web And Phone Use) a tradičně to skončilo tvrzením ve Forbesu vs. popírání od Xiaomi. Nově se můžete v Are Xiaomi browsers spyware? Yes, they are… podívat…<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/02/jsou-prohlizece-od-xiaomi-spyware/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/02/jsou-prohlizece-od-xiaomi-spyware/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/02/microsoft-predstavil-mesh-mixed-reality-ma-byt-neco-jako-virtualni-budoucnost-pro-microsoft-teams/</link>
    					<comments>http://rychlofky.cz/2021/03/02/microsoft-predstavil-mesh-mixed-reality-ma-byt-neco-jako-virtualni-budoucnost-pro-microsoft-teams/#respond</comments>
    					<description><![CDATA[Microsoft představil Mesh. „Mixed reality“ má být něco jako virtuální budoucnost pro Microsoft Teams. Mají k tomu nádherné video (Introducing Microsoft Mesh) co vypadá jako něco ze Star Treku. A pak se podíváte na realitu, kde se stále vyskytuje v něčem co vypadá jako Second Life posledních deset let. Detaily v Microsoft Mesh feels like…<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/02/microsoft-predstavil-mesh-mixed-reality-ma-byt-neco-jako-virtualni-budoucnost-pro-microsoft-teams/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/02/microsoft-predstavil-mesh-mixed-reality-ma-byt-neco-jako-virtualni-budoucnost-pro-microsoft-teams/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/01/instagram-spustil-live-rooms-zive-vysilani-az-ctyr-lidi-soucasne/</link>
    					<comments>http://rychlofky.cz/2021/03/01/instagram-spustil-live-rooms-zive-vysilani-az-ctyr-lidi-soucasne/#respond</comments>
    					<description><![CDATA[Instagram spustil Live Rooms. Živé vysílání až čtyř lidí současně. Užitečný formát pro rozhovory, talk show, hudebníky a další formáty Doposud bylo možné živě vysílat v jednom či ve dvou lidech. Detaily v Doubling Up on Instagram Live With Live Rooms<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/01/instagram-spustil-live-rooms-zive-vysilani-az-ctyr-lidi-soucasne/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/01/instagram-spustil-live-rooms-zive-vysilani-az-ctyr-lidi-soucasne/feed/</wfw:commentRss>
    		<link>http://rychlofky.cz/2021/03/01/gab-pry-hacknut-a-ddosecrets-tvrdi-ze-70-gb-hesel-prispevku-a-dat-zverejn/</link>
    					<comments>http://rychlofky.cz/2021/03/01/gab-pry-hacknut-a-ddosecrets-tvrdi-ze-70-gb-hesel-prispevku-a-dat-zverejn/#respond</comments>
    					<description><![CDATA[Gab prý hacknut a DDoSecrets tvrdí, že 70 GB hesel, příspěvků a dat zveřejní pro výzkumníky, novináře a vědce. Gab je jedním z krajně pravicových webů a útočištěm těch co si jsou přesvědčení, že Donald Trump vyhrál volby a svět je plný konspirací. Gab (opět) tvrdí, že nebyl hacknut. V tom samém prohlášení ale uvádí,…<span class="excerpt-more-link"><a class="more-link" href="http://rychlofky.cz/2021/03/01/gab-pry-hacknut-a-ddosecrets-tvrdi-ze-70-gb-hesel-prispevku-a-dat-zverejn/">Další <svg class="svg-icon" width="24" height="24" aria-hidden="true" role="img" focusable="false" viewBox="0 0 24 24" fill="none" xmlns="http://www.w3.org/2000/svg"><path fill-rule="evenodd" clip-rule="evenodd" d="M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z" fill="currentColor"/></svg></a></span>]]></description>
    					<wfw:commentRss>http://rychlofky.cz/2021/03/01/gab-pry-hacknut-a-ddosecrets-tvrdi-ze-70-gb-hesel-prispevku-a-dat-zverejn/feed/</wfw:commentRss>
    
  • Hello there,

    Many thanks for providing that information.

    What seems to be the issue you’re having with the feed apart from the http/https differences described there?

    The reason why I ask is because the feed is valid and should work just fine in any RSS reader.

    Many thanks in advance for any/all context that can be provided.

  • Yes, the feed can be parsed just fine, but that’s not the issue here. I’m going to describe the timeline of a hypothetical RSS reader.

    1. Reader fetches a feed.
    2. Parses all items and stores their unique IDs – values in <guid>. Let’s say it finds: <guid isPermaLink="false">https://rychlofky.cz/?p=36271</guid>
    3. Some time later it fetches the feed again.
    4. This time the feed contains links to http instead of https. It finds an item with the id <guid isPermaLink="false">http://rychlofky.cz/?p=36271</guid> and treats it as a separate item (the URL is different).

    The issue I’m trying to describe is that *sometimes* wordpress.com generates links with http and at other times with https. That makes an RSS reader duplicate entries, because the two versions contain different data.
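
    To make the timeline above concrete, here is a minimal Python sketch of such a hypothetical reader. It is not how Inoreader or any real reader is implemented; the function name fetch_new_items and the two hand-written feed snippets are assumptions for illustration, and only the guid values come from the steps above.

```python
# Hypothetical minimal reader: it remembers every <guid> string it has seen.
import xml.etree.ElementTree as ET

# Hand-written stand-in for the feed as served at one time of day.
FEED_AT_NOON = """<rss version="2.0"><channel>
<item><title>Post</title><guid isPermaLink="false">https://rychlofky.cz/?p=36271</guid></item>
</channel></rss>"""

# Same item later, but the guid is emitted with http:// instead of https://.
FEED_AT_NIGHT = FEED_AT_NOON.replace("https://rychlofky.cz/?p=", "http://rychlofky.cz/?p=")


def fetch_new_items(feed_xml, seen_guids):
    """Return guids not seen before and record them; guids are compared as plain strings."""
    new = []
    for item in ET.fromstring(feed_xml).iter("item"):
        guid = item.findtext("guid")
        if guid is not None and guid not in seen_guids:
            seen_guids.add(guid)
            new.append(guid)
    return new


seen = set()
first = fetch_new_items(FEED_AT_NOON, seen)    # steps 1 and 2
second = fetch_new_items(FEED_AT_NIGHT, seen)  # steps 3 and 4
print(first, second, len(seen))
```

    Under these assumptions the second fetch reports the same post as new again, because the two guid strings differ in their scheme.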

  • Can you give us an example where your feed is pulling duplicate entries? A link to a feed if it’s in the WordPress.com Reader or a screenshot of the behavior so we can try to replicate the issue?

    Also let us know what feed reader you’re using to serve your RSS feed if it’s not the WordPress.com Reader. Once we have that information, we’ll be happy to take a closer look.

    Thanks!

  • I mentioned that in my second comment already: I’m using https://inoreader.com/

    No, I’m not using WordPress.com Reader as RSS reader.

  • Hm, that doesn’t make any sense. Our redirect from http:// to https:// is a 301 Moved Permanently, so http:// shouldn’t be accessible.

    Are you sure that your feed is saved at https://rychlofky.cz/feed/ in their reader, not http://?

    If so, you may need to contact their support about this, as I’m unable to reproduce the issue over time in my feed readers.

  • Frankly, what doesn’t make sense to me is that neither of you has so far acknowledged that I’m referring not to the URL of the feed, nor to the site URL, but to the content of the feed that is generated by wordpress.com.

    In my post today at 12:24 pm (#3643467) I showed you feed content whose lines referenced http:// in the tags <link>, <comments>, <description>, …, <wfw:commentRss>. In particular, that means <guid> contained https://.

    A couple of hours later the feed has again been regenerated differently: now only the <guid> tag contains http:// and all the others contain https://:

    $ curl -s https://rychlofky.cz/feed/ | grep http://rychlofky 
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36321</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36318</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36314</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36311</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36308</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36305</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36301</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36297</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36294</guid>
    		<guid isPermaLink="false">http://rychlofky.cz/?p=36290</guid>
    $

    Do you see the difference in *content*?

    If you would like to reproduce that error, keep that feed in the reader for a day or so (it seems these two generators alternate once a day or so) and you should see that.
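
    The grep above only shows the http:// lines. A slightly more structured check, sketched here in Python, tallies which scheme each tag uses. The sample feed, the host example.org and the function name schemes_by_tag are all made up for illustration; this is not a capture of the real feed.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample (hypothetical host): links use https://, the guid uses http://.
SAMPLE = """<rss version="2.0"><channel>
<link>https://example.org</link>
<item>
<link>https://example.org/2021/03/03/sample-post/</link>
<comments>https://example.org/2021/03/03/sample-post/#respond</comments>
<guid isPermaLink="false">http://example.org/?p=1</guid>
</item>
</channel></rss>"""


def schemes_by_tag(feed_xml, host="example.org"):
    """Count (tag, scheme) pairs for elements whose text starts with a URL on the given host."""
    counts = Counter()
    for element in ET.fromstring(feed_xml).iter():
        text = (element.text or "").strip()
        match = re.match(r"(https?)://" + re.escape(host), text)
        if match:
            counts[(element.tag, match.group(1))] += 1
    return counts


counts = schemes_by_tag(SAMPLE)
for (tag, scheme), n in sorted(counts.items()):
    print(tag, scheme, n)
```

    Saving the output of two fetches made some hours apart and running each through such a tally would show at a glance which tags flipped between http and https.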

  • Hello there,

    Many thanks for that detailed explanation of events.

    From our side, we are unable to replicate that switch in feed content from http to https, but that doesn’t mean it’s not happening. The fact that the switch appears approximately every 24 hours suggests it may be a DNS issue.

    I see the domain is being mapped from a domain registrar. When a domain is mapped, an SSL is automatically applied, and the http > https redirect is applied.

    I’d like to rule out anything with the domain mapping that may be causing that, so can you confirm the following please:

    1. Can you check that only one (and not both) of the following DNS settings is applied to map the domain:

    Nameservers

    
    ns1.wordpress.com
    ns2.wordpress.com
    ns3.wordpress.com
    

    or

    A records

    
    192.0.78.24
    192.0.78.25 
    

    This will rule out any doubling up of DNS settings, where the redirect is applied and redirected again.

    2. I ran the URL through Why No Padlock

    https://www.whynopadlock.com/results/94a1609e-135c-4adf-867e-f167077de4ec

    Are you able to confirm if TLSv1 is enabled with the domain registrar?

    3. Lastly, I notice there’s a Facebook TXT verification record in place. Would it be viable to temporarily remove this and see if the switch still happens, please?

    4. This step is optional, but it could point the finger toward the domain mapping setup: if you’re able to confirm whether this happens without https://rychlofky.cz/ as the primary domain, that would also be useful information. Just to reassure you, https://rychlofky.cz/ would still redirect to the default WordPress.com domain, so you wouldn’t lose any traffic there.

    5. Can you get https://inoreader.com/ to explain why they’re duplicating content when the feed is valid and http does redirect to https, please?

    I know that seems a lot, but any and all information would be useful in getting to the bottom of why that is happening with the domain.

    Many thanks in advance.

  • You have asked for an example, here it is.

    I have the feed in Inoreader and marked all entries as read except the duplicates, just to speed up scrolling. Then I copied the addresses of both items into an editor so you can see the difference: one points to http:// and the other to https://

    https://photos.app.goo.gl/knkrSm8Yqk2Fi9NT8

    You may think that this is an issue with Inoreader. In this video I show you that I have the feed with an https:// address and then I search for the guid entry. Here I struggle with a pretty big XML file on my phone, but at the end you’ll see that the item(s) are identified with http:// addresses. And just to point out, this feed was generated by wordpress.com and is available under http://, and the site is also available under https://.

    https://photos.app.goo.gl/KhzP5fHJ5AXdceMZA

    A couple of hours later I recorded a similar video, where I load the https:// feed again and search for guid. This time the entries are referenced with https:// addresses.

    https://photos.app.goo.gl/VhyT2GgBTnuKYJLW7

    Why am I pointing to guid? Here is a short article that describes what RSS readers have to go through when they don’t find a guid at all: http://www.xn--8ws00zhy3a.com/blog/2006/08/rss-dup-detection

    In the first video you can actually see that Inoreader detected a duplicate entry, most likely by destination addresses, title, text or other fields that are kept the same.
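
    How Inoreader actually decides this is not known to me. Purely as an illustration of that kind of heuristic, the Python sketch below flags two entries as likely duplicates when their guids differ but the title and the scheme-less link match. The names link_key and looks_like_duplicate and the example.org entries are hypothetical.

```python
from urllib.parse import urlsplit


def link_key(url):
    """Drop the scheme so that http:// and https:// variants of one address compare equal."""
    parts = urlsplit(url)
    return (parts.netloc.lower(), parts.path, parts.query)


def looks_like_duplicate(entry_a, entry_b):
    """Different guid, but same title and same link once the scheme is ignored."""
    return (
        entry_a["guid"] != entry_b["guid"]
        and entry_a["title"] == entry_b["title"]
        and link_key(entry_a["link"]) == link_key(entry_b["link"])
    )


# Made-up entries standing in for the two variants of one post.
old = {
    "guid": "https://example.org/?p=1",
    "title": "Sample post",
    "link": "https://example.org/2021/03/03/sample-post/",
}
new = {
    "guid": "http://example.org/?p=1",
    "title": "Sample post",
    "link": "http://example.org/2021/03/03/sample-post/",
}
flagged = looks_like_duplicate(old, new)
print(flagged)
```

    A heuristic like this can only warn about a probable duplicate; it cannot prove that two entries with different guids are the same item.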

    Finally, in the last post you suggested looking at https://www.whynopadlock.com/results/94a1609e-135c-4adf-867e-f167077de4ec, and I did. I noticed that the page listed other domains that matched the certificate.

    I have looked at the very first site – ryangardenphotography.com – which is (by accident?) hosted by wordpress.com and yes, this has also TLS enabled and available under https:// (https://www.whynopadlock.com/results/79a63286-78fc-4b8c-a4c4-93f4265acf2b but that is not important right now). How does their feed https://ryangardenphotography.com/feed/ look?

    At the very beginning of the video you can see that the feed is generated for site available under https:// , also an item in the feed links to https:// , but guid points to http://.

    https://photos.app.goo.gl/saMagUY1SPv3yyCX8

    Does that mean the other site is misconfigured as well? Well, I’d argue that WordPress.com generates guids with http:// and from time to time some other job generates guids with https://. I cannot show you that on this random site because it doesn’t seem to be updated often enough – last update is from February.

  • Hello there,

    Many thanks for getting back to us and taking the time to create those videos, any and all information is useful as we work through this.

    May I ask you to carry out points 1, 3, 4 and 5 of the previous response please.

    It’s best to not leave any stone unturned :) .

    Many thanks in advance and looking forward to hearing the results.

  • I’m afraid that I’m not able to do points 1, 3 and 4 that you have listed. I’m not the owner of the site, nor do I have admin access.

    As for the fifth point, it does not even make sense to me: I don’t think the feed is invalid, nor that the problem is with http redirecting to https. The issue is that, in the feed itself, the identifier of the same entry (in the guid tag) alternates between http:// and https://. Because those two values aren’t the same, Inoreader (from my point of view correctly) cannot say whether that’s expected or not and only warns the user that there might be a duplicate; see the very first video in my previous post. That’s why I don’t see a reason to report this to Inoreader.

    I’m sorry for the few days of silence. Since then I have created a test site, where I’m not able to completely reproduce the alternating feed content. All feeds use guids that point to http://, even though the rest of the site is available under https://.

    You can see it here: https://testsite115163463.wordpress.com/blog/feed/ where I have created an RSS entry like this:


    <item>
    <title>test9</title>
    <link>https://testsite115163463.wordpress.com/2021/03/09/test9/</link>
    <comments>https://testsite115163463.wordpress.com/2021/03/09/test9/#respond</comments>

    <dc:creator><![CDATA[scroolik]]></dc:creator>
    <pubDate>Tue, 09 Mar 2021 21:55:56 +0000</pubDate>
    <category><![CDATA[Uncategorized]]></category>
    <guid isPermaLink="false">http://testsite115163463.wordpress.com/?p=34</guid>

    <description><![CDATA[test9]]></description>
    <content:encoded><![CDATA[
    <p>test9</p>
    ]]></content:encoded>

    <wfw:commentRss>https://testsite115163463.wordpress.com/2021/03/09/test9/feed/</wfw:commentRss>
    <slash:comments>0</slash:comments>

    <media:content url="https://0.gravatar.com/avatar/c64cfbb9854ec9a9470521fd134fc541?s=96&d=identicon&r=G" medium="image">
    <media:title type="html">scroolik</media:title>
    </media:content>
    </item>

    The surprising part is:

    <guid isPermaLink="false">http://testsite115163463.wordpress.com/?p=34</guid>

  • Hello,

    Many thanks for getting back to me, and it’s a shame you can’t rule out those other points.

    That’s why I don’t see a reason to report this to Inoreader.

    You really should reconsider this, the reasons being:

    1. As there is only one site, there is only one version of the content.
    2. 301 redirects are automatically applied for http > https, regardless of what’s being shown in the feed. This means readers should only be showing one copy of the content (which most do.)
    3. As you say, the feed is valid and functioning – it’s the reader which is the problem here.

    I hope this helps and that you manage to get this resolved with Inoreader.

  • You really should reconsider this, the reasons being:

    1. As there is only one site, there is only one version of the content.

    I agree with the first half of the statement. I do agree with the second one as long as you are referring to html pages served by WordPress.com.

    I have shown you above that WordPress.com does not produce just one version of the RSS feed. There are two formats of the RSS feed that alternate.

    2. 301 redirects are automatically applied for http > https, regardless of what’s being shown in the feed. This means readers should only be showing one copy of the content (which most do.)

    I do agree with the first part of the statement here again.

    I’m not sure about the second one, where you say that most readers behave like the WordPress.com one. I can only say that Inoreader does not, and I see a pretty good reason for that.

    3. As you say, the feed is valid and functioning – it’s the reader which is the problem here.

    Here again, I haven’t disputed the validity of the feed. What I do dispute is its stability and functionality. And I do not agree with your statement that the reader is the problem here.

    Inoreader is an example of a service that interprets the feed in a pretty much correct way. Why? Because WordPress.com generates different <guid> values for the very same entry.

    I see that you point out that the content of the <guid> element in the feed is a URL. And *if* the reader detects it as a URL *and* follows it, it will end up on the very same site, served through https://.

    But that’s not how RSS readers are required to work. Let’s see how guid is specified in the RSS standard: https://validator.w3.org/feed/docs/rss2.html or https://www.rssboard.org/rss-specification :

    guid stands for globally unique identifier. It’s a string that uniquely identifies the item. When present, an aggregator may choose to use this string to determine if an item is new.

    <guid>http://some.server.com/weblogItem3207</guid>

    There are no rules for the syntax of a guid. Aggregators must view them as a string. It’s up to the source of the feed to establish the uniqueness of the string.

    If the guid element has an attribute named “isPermaLink” with a value of true, the reader may assume that it is a permalink to the item, that is, a url that can be opened in a Web browser, that points to the full item described by the <item> element. An example:

    <guid isPermaLink="true">http://inessential.com/2002/09/01.php#a2</guid>

    isPermaLink is optional, its default value is true. If its value is false, the guid may not be assumed to be a url, or a url to anything in particular.

    I’d like to emphasize “Aggregators must view them as a string.” Even though it looks like a URL, WordPress.com sets isPermaLink="false" for this element, so the last sentence of the quote applies.

    Then from the same page:

    it’s recommended that you provide the guid, and if possible make it a permalink. This enables aggregators to not repeat items, even if there have been editing changes.

    As you can see, guid can be used to de-duplicate entries. And as I wrote above, the guids are sometimes strings that start with http:// and at other times with https://, so they are (and should be) treated as separate entries, which then show up as duplicates.

    And just to show you that <guid> doesn’t necessarily need to be a URL, see the example here: https://en.wikipedia.org/wiki/RSS#Example . Hopefully nobody sane would try to treat that UUID as a URL. That’s because the attribute isPermaLink is correctly set to “false”.
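
    The rules quoted above can be sketched in a few lines of Python. This is only my reading of the spec text, not a reference implementation: the helper name guid_identity is hypothetical, and the https:// variant of the guid is made up to mirror the http:// one from the feed.

```python
import xml.etree.ElementTree as ET


def guid_identity(item):
    """Return (guid_string, is_permalink) for an <item>, or None if it has no guid."""
    guid = item.find("guid")
    if guid is None or guid.text is None:
        return None
    # Per the quoted spec text, isPermaLink is optional and its default value is true.
    is_permalink = guid.get("isPermaLink", "true").strip().lower() == "true"
    # The guid itself is kept as an opaque string; it is never parsed or normalized as a URL.
    return guid.text.strip(), is_permalink


a = ET.fromstring('<item><guid isPermaLink="false">http://rychlofky.cz/?p=36321</guid></item>')
b = ET.fromstring('<item><guid isPermaLink="false">https://rychlofky.cz/?p=36321</guid></item>')
same_item = guid_identity(a) == guid_identity(b)
print(guid_identity(a), guid_identity(b), same_item)
```

    Compared strictly as strings, the two guids are not equal, so a reader that follows the spec to the letter has no basis for treating them as the same item.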

  • Thanks for the writeup! I have reported this to our developers, but cannot provide an ETA for a fix, as we have several higher priority items at the moment.

    I agree our feeds should not be working this way, but your feed reader also should not be working this way (and we have been unable to reproduce the issue in other feed readers), so I also recommend contacting your feed reader’s support. It’s possible a fix on their end will be completed sooner than ours.
